1. Is it possible to replace team members after submission?
2. Is it possible to have more than 6 people in the group?
– The submission proposal can have more than 6 members, but for the competition only 6 team members can compete.
3. Can we get a formal invitation letter from the competition? We need it in order to get a visa to Germany.
– Yes, please email us and we will generate it for you.
4. For the final architecture paper, will we be able to install/use additional software/libraries after submission?
5. Is it acceptable to manipulate the hardware’s software configuration after the benchmark runs, so long as the system is not rebooted and nothing is physically changed on the system?
– The servers need to be powered on at all times. No hardware changes are allowed, but you can decide not to use a specific server in a test.
6. What can I expect regarding student accommodation, meals, etc.?
– We don’t supply accommodation for the students; you will need to work this out on your own. Meals may be provided during the competition.
Other info can be found at https://www.isc-hpc.com/travel-stay.html
7. According to the website (http://hpcadvisorycouncil.com/events/student-cluster-competition/rules/): Teams might need to prepare an adapter to convert from the Type F (Schuko) wall power to the C20 socket for the input of the PDU.
Does it mean we need to bring a PDU cable from the wall plug (Type F) to the APC PDU?
– Team booths include (1) power source, (1) APC AP8858 PDU, and (1) Tripp Lite Schuko-to-C19 cable for the PDU. The PDU will be plugged into one of the available receptacles of the power source. Additional Schuko power strips for plugging in equipment (such as laptops or non-competition systems) will be provided by ISC Events.
8. Apart from physically disconnecting and removing components on a system, or turning them off through the BIOS (both of which are presumably not allowed), one can also conserve energy by using scripts to adjust system component configurations such as GPU/CPU clock frequency. Is it acceptable to make this sort of change (by running scripts) during the competition, after the benchmark runs?
– Tuning is OK. We don’t allow changing or resetting the hardware after the first submission.
1. Can the HPL binary from the HPCC application be substituted by the NVIDIA-optimized binary?
– It’s possible, and may help to better utilize the GPUs for HPCC. However, we need to understand the exact code changes to make sure they are valid.
2. We would like to implement an optimization in HPCG. With OptimizeProblem() it is possible to allocate memory only at the beginning of the problem; however, it is not possible to do any memory management at the beginning and end of every CG iteration. Are we allowed to modify CG.cpp for this purpose? (The algorithm remains the same.)
– Yes, we allow changes to the code for various reasons. For example, if a team has special hardware (such as GPUs) and wants to offload the compute-intensive routines to that hardware, the team can modify the code instead of using the reference implementation, as long as the implementation doesn’t change the logic of the application/algorithm. In your case, it seems to be OK.
In any case, you will need to show us the code changes so we can review them, and supply a description of the changes to help us with that.
3. As for HPCC – is there a limit on how much code is allowed to be optimized? (We can demonstrate that we didn’t change the algorithm.)
– For code changes you will need to show us your modifications to HPCC/other codes; we will need to review and evaluate the changes.
4. May I know if all nodes are required to participate in benchmarking? We can have all nodes connected to the power supply, and will not change the hardware configuration after benchmarking.
– After submitting the first set of results, that is the hardware you will need to use throughout the competition. For example, if you use 8 nodes in the first submission (say, for LINPACK), you will need to keep using 8 nodes throughout, even if you could use fewer, or want to use more/other nodes.
5. What will the organizer provide in the ISC student competition booth? For example, racks, tables and chairs?
– We provide the booth, tables and chairs, power connections, and an internet connection. We don’t provide server racks.
6. For HPCC, can we use GPUs for computation in DGEMM and FFT?
– Yes, it is possible, but your team will need to explain what the changes are so we can review them.
7. Would the 7 tests of HPCC have equal weight in the score, or will you focus on the result of HPL in HPCC only?
– There are subtests within HPCC. Some, but not all, of the results from the subtests will be used to determine the HPCC score. Each of the scores from those subtests will be weighted equally.
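To get a feel for what equal weighting of subtests means, here is a minimal sketch. It is not the official scoring formula (which is not published here); the subtest names, numbers, and the ratio-to-best normalization are all hypothetical, chosen only to illustrate equally weighted subtest scores.

```python
# Hypothetical illustration of equally weighted HPCC subtest scoring.
# Assumption: each subtest result is normalized against the best result
# for that subtest, then the normalized scores are averaged with equal
# weight. The subtest names and values below are made up.

def hpcc_score(team_results, best_results):
    """Average of per-subtest ratios (team / best), equally weighted."""
    ratios = [team_results[t] / best_results[t] for t in best_results]
    return sum(ratios) / len(ratios)

best = {"HPL": 12.0, "STREAM": 300.0, "FFT": 80.0}  # best result per subtest
team = {"HPL": 9.0, "STREAM": 300.0, "FFT": 60.0}   # one team's results

print(round(hpcc_score(team, best), 3))  # (0.75 + 1.0 + 0.75) / 3 -> 0.833
```

The point is only that no single subtest (such as HPL) dominates: every subtest contributes the same weight to the final HPCC score.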
1. According to http://hpcadvisorycouncil.com/events/student-cluster-competition/benchmarking/ , “Time to solution” results will be used as a reference from the log file. We found that “Time to Solution” only exists in Test_compressed_lanczos_hot_start but Benchmark_ITT doesn’t call this test. We tried to run Test_compressed_lanczos_hot_start directly but it requires a file Params.xml. Could you please provide a sample of this file for reference? And which one do we run during the competition, Benchmark_ITT or Test_compressed_lanczos_hot_start?
– We don’t currently have a Params.xml file to share. To tune the application, you can use the Benchmark_ITT tests, which don’t require an input file.
2. There are three types of drivers that can output the total time; may I know which one will be used?
– The “Total Computation time” reported by the application is the number of seconds to be used.
3. How do we calculate the GRID results?
– For your testing, look for the “result” field after “Comparison point” in the output:
— This line —> Comparison point result: 79826.6 Mflop/s per node
Comparison point is 0.5*(85631.9+74021.3)
Comparison point robustness: 0.819
For the competition, we will supply the input file and give you instructions on how to test. We will be looking at the “Time to solution” output.
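For practice runs, a short script can pull the “Comparison point result” figure out of a GRID Benchmark_ITT log. The sample lines below are the ones quoted in the answer above; the exact spacing may vary between versions, so the pattern is deliberately tolerant.

```python
import re

# Sketch: extract the "Comparison point result" value (Mflop/s per node)
# from GRID Benchmark_ITT output. The sample text is taken from the log
# excerpt quoted in this FAQ.

sample_log = """\
Comparison point result: 79826.6 Mflop/s per node
Comparison point is 0.5*(85631.9+74021.3)
Comparison point robustness: 0.819
"""

def comparison_point_result(log_text):
    """Return the Mflop/s-per-node figure, or None if the line is absent."""
    m = re.search(r"Comparison point\s+result:\s*([0-9.]+)\s*Mflop/s per node",
                  log_text)
    return float(m.group(1)) if m else None

print(comparison_point_result(sample_log))  # 79826.6
```

Remember that this “result” field is only for your own tuning; at the competition the ranking uses the “Time to solution” output instead.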
4. About GRID: do we have to use single or double precision, or is that not relevant?
– This is not relevant; we will use only single precision. This is just an example.
5. The GRID version to be used is the ISC-freeze-2 tag.
1. Will we be given an input file for Nektar++? If so, will we be able to change the entries in the input file?
– Yes, we will give you the input file on the day of the competition. No changes can be made.
2. What are the judging criteria for Nektar++? Are you looking at both error and computational time?
– The “Total Computation Time” parameter in the output.
3. For all of this year’s applications you either specify a given version or state that no specific version is required. However, for Nektar++ no requirement is mentioned.
Is it safe to use the latest stable version (4.4.1)?
– It is safe to experiment and learn using a stable version but, for the competition, we will give the teams the exact version to use based on the master.
4. Could you please clarify which of the following optional packages for Nektar++ are required for the competition?
The list of packages is available in the User Guide found on the website (http://www.nektar.info/downloads/file/user-guide-pdf-3/):
ARPACK > 2.0 (for arnoldi algorithms)
PETSc (Alternative linear solvers)
Scotch (Alternative mesh partitioning)
VTK > 5.8 (Visualization utilities)
– There is no visualization task this year, so VTK may be skipped. As for the solvers, it is better to have all of them ready.
5. As explained in the tutorial section (https://www.nektar.info/community/tutorials/), the Nektar++ framework goes through different steps such as:
Generating a problem mesh
Configuring the problem definitions
Running the solver (yields the “Total Computation time”)
Which of these steps will be required during the competition?
– We measure the “Total Computation time” of the solver to rank the teams. We will supply the input file; you won’t need to perform pre-processing (generating a problem) or post-processing, only run the solver. You can do those steps for your own practice.
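Since teams are ranked on the “Total Computation Time” reported by the solver, it can be handy during practice to extract that figure automatically. This is a sketch only: the field name comes from the answers above, but the exact formatting of the output line is an assumption here, so adjust the pattern to whatever your Nektar++ build actually prints.

```python
import re

# Sketch: pull the "Total Computation Time" (seconds) from Nektar++ solver
# output. The sample line format below is assumed, not taken from official
# Nektar++ documentation; tweak the regex to match your solver's output.

sample_output = "Total Computation Time = 142.7s"

def total_computation_time(text):
    """Return the solver's reported computation time in seconds, or None."""
    m = re.search(r"Total Computation Time\s*=\s*([0-9.]+)\s*s", text)
    return float(m.group(1)) if m else None

print(total_computation_time(sample_output))  # 142.7
```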
6. Are there any example input files for Nektar++? We want to find the hotspots, but we don’t have any idea about the workload in the competition.
– We don’t supply examples; you can review the examples on the Nektar++ web page.
7. Do you fix the version of the Nektar++ benchmark?
– We found several issues with the latest 4.4.1 version; we will supply a version of Nektar++ on the competition day.
8. As there are many types of solvers and mesh utilities, would it be possible to let all of us know the subset of solvers that may be used in the competition?
– There are 13 solvers in Nektar++; we will reveal the solver to be used on the competition day.
1. Can I use any version of TensorFlow?
– Any generally available version can be used, and changes can be made. The differences/patches should be sent to us for review.
2. Will we be able to change the learning rate without adhering to the limitations of the parameters provided by benchmark_cnn.py?
– benchmark_cnn.py is the core benchmark script; no changes beyond the available parameters are allowed, unless a change to it, or to any of its dependencies, is required to work around known issues/bugs.
3. Will we be allowed to use FP16 during the training process while the final variables are FP32?
– FP16 can be used for training. Beyond the training phase, there will be no inferencing against a trained model. We are scoring on throughput, accuracy and loss.
4. Will we be able to change the last 2 layers of the VGG16 network to fully connected convolutions to make use of existing optimizations of convolutions for tensor cores?
– You cannot change the characteristics or layers of VGG16.
5. I realised that there is image pre-processing for ImageNet recommended by benchmark_cnn.py; however, the image pre-processor is for ResNet. Will it be possible to implement our own image pre-processing that follows the training method recommended by the VGG paper, or do we have to use the image pre-processing that came with benchmark_cnn?
– This is the link you need to use: https://github.com/tensorflow/models/tree/master/research/inception#getting-started
# build the preprocessing script.
bazel build //inception:download_and_preprocess_imagenet
# run it
6. I have downloaded and processed the images already. What I meant is that there is an input_preprocessor flag in benchmark_cnn.py; can it be changed to use a different input pre-processor?
– Follow the guidance set here: https://www.tensorflow.org/performance/performance_guide, using the parameters available from the benchmark: https://github.com/tensorflow/benchmarks/blob/master/scripts/tf_cnn_benchmarks/benchmark_cnn.py
You are restricted from introducing changes to the base VGG16 model (i.e. no additional layers in the training that cannot be verified without inferencing). Most of the preprocessing for this benchmark is already done by download_and_preprocess_imagenet, but there are several optimization methods that are tunable based on architecture choices.
8. Are we allowed to use high level interfaces to tensorflow, for example Keras ( https://keras.io )?
– Yes, Keras is allowed.
9. I would like to use official_model_imagenet input_preprocessor provided by tensorflow/models/official/resnet/imagenet_preprocessor.
Would that be alright?
– Yes, this is fine.
10. Is there a new dataset provided to be run? If so, does the pre-processing time count toward the training time? Can we use a pre-trained model in the competition?
– You will need to use ImageNet 2012 (http://www.image-net.org/challenges/LSVRC/2012/nonpub-downloads) and run the pre-processing stage before the competition.
It is all documented in the Benchmarking section.
11. Can we use a pre-trained model, or do we need to train from scratch? Is there a new dataset on the competition day, and if so, is the pre-processing time included in the running time?
– You cannot use a pre-trained model; you only need to perform the pre-processing of the images (changing file names). There is no new set of images on the competition day; we will work with ImageNet 2012. The pre-processing time is not included, and should be done prior to the competition day (if possible). If you only get your servers on the competition day, try to do it before the competition starts. Make sure you can download ImageNet, or bring it with you, perhaps on a flash drive.
12. Are we allowed to use the TensorRT (https://developer.nvidia.com/tensorrt) Software from Nvidia?
– No, TensorRT is a DL inference optimizer. We are not doing inferencing.
13. Will we be evaluated by final validation accuracy or training accuracy?
– There is no difference between them.
14. What do the optimization methods in TensorFlow specifically refer to?
– Changes to the TensorFlow framework are allowed. You can use RDMA, or parallelize the work across several servers, without changing the input data or the model used.
15. Can we create our own preprocessor or modify the preprocessor (https://github.com/tensorflow/benchmarks/blob/master/scripts/tf_cnn_benchmarks/preprocessing.py) in order to add more features such as rotation?
– You can, but you cannot change the attributes of the standard VGG image sizing, or the VGG model.
16. Are variable update methods other than parameter_server, replicated, distributed_replicated or independent allowed in the AI problem?
– The benchmark scripts must be used, and you must stay within their usage guidelines.
17. What are the detailed rules for scoring the AI application, or will they be announced on the competition day? In other words, what exactly does “We are scoring on throughput, accuracy and loss” mean? Also, is any kind of change to the VGG16 model code allowed? We won’t change the architecture (network layers and connections), but what about the implementation of the kernels?
– You will be asked to demonstrate maximum accuracy within a fixed training time (to be announced at the competition, around 2h) using ImageNet 2012.
We will check the level of accuracy you achieved.
The model code should not be changed. You can change the TensorFlow framework, the communication layers (the usage of RDMA, for example), the usage of GPUs, and so on.
18. We have encountered a bug similar to https://github.com/tensorflow/tensorflow/issues/658, involving TensorFlow and probably the current NVIDIA drivers. We found that we can circumvent the bug by using Horovod (https://github.com/uber/horovod). Hence our question: are we allowed to use Horovod for the competition?
– Yes, Horovod is OK to use.
19. During the competition, do we have to complete the training within one benchmarking session, or can we split it into multiple small training sessions? In addition, our training sometimes reaches NaN losses; we are unsure if this could be due to an error in the dataset. Could we request the dataset during the competition?
– Did you pre-process the data?
We will have the data available on a flash drive (~150 GB).
The objective is not to complete the training, so you may finish only one session.
20. Should we run tf_cnn_benchmark.py in “benchmark” or “eval” mode?
–[no]eval: whether use eval or benchmarking (default: ‘false’)
That parameter should remain at its default.