1 - Overview and Objective
We summarize the overview and objective of the Working Group.
Encourage and support the curation of large-scale experimental and
scientific datasets and the engineering of ML benchmarks operating on
those datasets.
The WG will engage with scientists, academics, and national laboratories, as well as experimental facilities such as synchrotrons, in securing, engineering, curating, and publishing datasets and machine learning benchmarks that operate on experimental scientific datasets. This entails working across different domains of science, including the material, life, environmental, and earth sciences, particle physics, and astronomy, to name a few. We will include both traditional observational and computer-generated data.
Although scientific data is widespread, curating, maintaining, and
distributing large-scale, useful datasets for public consumption is a
challenging process, covering various aspects of data handling (from FAIR
principles to distribution to versioning). With large data products,
various ML techniques have to be evaluated against different
architectures and different datasets. Without these benchmarking
efforts, the community has no clear pathway for utilizing these
advanced models. We expect that the collection will have significant
tutorial value, as examples from one field or one observational or
computational experiment can be adapted to advance other fields and
experiments.
The working group’s goal is to assemble and distribute scientific data
sets relevant to a scientific campaign in a systematic manner, and
pose quantifiable targets (“science benchmark”). A benchmark involves
- (i) a data set,
- (ii) objective criteria to meet, and
- (iii) an example implementation.
The objective criteria depend on the scientific
problem at hand. The metric should be well defined on the data but
could come from a diverse set of measures (one or more of: accuracy
targets, top-1 or top-5 error, time to convergence, cross-validation
rates, confusion matrices, type-1/type-2 error rates, inference times,
surrogate accuracy, control stability measures, etc.).
2.3 - TEvolOp Earthquake Forecasting
Forecasting Earthquakes
Time series appear in many scientific problems, and many of them are
geospatial: functions of both space and time. This benchmark illustrates
that type. Some time series have a clear spatial structure in which, for
example, nearby spatial points are strongly related. The problem chosen
here is termed a spatial bag, where there is spatial variation but it is
not clearly linked to the geometric distance between spatial regions. In
contrast, traffic-related time series have a strong spatial structure.
We intend the benchmarks to cover a broad range of problem types.
The earthquake data comes from the USGS, and we have chosen a region
covering Southern California spanning 4 degrees of latitude (32 to 36 N)
and 6 degrees of longitude (-120 to -114). The data runs from 1950 to the
present day and is presented as events: magnitude, ground location,
depth, and time. We have divided the data into time and space bins. The
time interval is daily, but in our reference models we accumulate this
into fortnightly data. Southern California is divided into a 40 by 60
grid of 0.1 by 0.1-degree “pixels”, which correspond roughly to squares
with an 11 km side. The dataset also includes an assignment of pixels to
known faults and a list of the largest earthquakes in that region from
1950 until today. We have chosen various samplings of the dataset to
provide both input and predicted values. These include time ranges from
a fortnight up to 4 years. Further, we calculate summed magnitudes and
depths and counts of significant quakes (magnitude > 3.29). Other easily
available quantities are powers of quake energy (using Energy ~ 10^(1.5 m),
where m is magnitude). Quantities are “energy averaged” when there are
multiple events in a single space-time bin, except for simple event counts.
Current reference models are a basic LSTM recurrent neural network and a
modification of the original science transformer. Details can be found
here and here.
TEvolOp Specific Benchmark Targets
- Scientific objective(s):
  - Objective: Improve the quality of earthquake forecasting
  - Formula: Normalized Nash–Sutcliffe model efficiency coefficient (NNSE); see the definition at the end of this section
  - Score: The NNSE lies between 0.8 and 0.99, depending on the model and the predicted time series
- Data
  - Download: https://drive.google.com/drive/folders/1wz7K2R4gc78fXLNZMHcaSVfQvIpIhNPi?usp=sharing
  - Data Size: 5GB from USGS
  - Training samples: The data is divided spatially in an 80%-20% fashion between training and validation. The full dataset covers 6 degrees of longitude (-114 to -120) and 4 degrees of latitude (32 to 36) in Southern California. This is divided into 2400 spatial bins, 0.1 degree (~11 km) on a side.
  - Validation samples: Most analyses use the 500 most active bins, of which 400 are training and 100 validation.
- Example implementation
Example Implementation:
The example implementation is primarily to demonstrate feasibility, show
how the data is represented, help address any interpretation
considerations, and potentially trigger initial ideas on how the
benchmark can be improved.
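For reference, the NNSE used as the metric above is a bounded form of the Nash–Sutcliffe efficiency (NSE). With observed values o_t, model predictions m_t, and observation mean ō, the commonly used definitions are:

```latex
\mathrm{NSE}  = 1 - \frac{\sum_t \left(o_t - m_t\right)^2}{\sum_t \left(o_t - \bar{o}\right)^2},
\qquad
\mathrm{NNSE} = \frac{1}{2 - \mathrm{NSE}}
```

NSE ranges over (-inf, 1], and the normalization maps it to (0, 1], with 1 corresponding to a perfect forecast. The reference implementations may differ in detail, for example in how space-time bins are aggregated before the sums are taken.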
2.4 - STEMDL (Classification)
State-of-the-art scanning transmission electron microscopes (STEM) produce focused electron beams with atomic dimensions and make it possible to capture diffraction patterns arising from the interaction of incident electrons with nanoscale material volumes.
State-of-the-art scanning transmission electron microscopes (STEM)
produce focused electron beams with atomic dimensions and make it
possible to capture diffraction patterns arising from the interaction of
incident electrons with nanoscale material volumes. Backing out the
local atomic structure of these materials requires compute- and
time-intensive analyses of the diffraction patterns (known as convergent
beam electron diffraction, CBED). Traditional analysis of CBED requires
iterative numerical solutions of partial differential equations and
comparison with experimental data to refine the starting material
configuration. This process is repeated anew for every newly acquired
experimental CBED pattern and/or probed material.
In this benchmark, we used newly developed multi-GPU and multi-node
electron scattering simulation codes [1] on the Summit supercomputer to
generate CBED patterns from over 60,000 solid-state materials,
representing nearly every known crystal structure. A scaled-down version
of this data [2] is used for one of the data challenges [3] at the SMC
2020 conference, and the overarching goals are to: (1) explore the
suitability of machine learning algorithms in the advanced analysis of
CBED and (2) produce a machine learning algorithm capable of overcoming
intrinsic difficulties posed by scientific datasets.
A data sample from this dataset is given by a 3-d array formed by
stacking CBED patterns simulated from the same material at distinct
material projections (i.e., crystallographic orientations). Each CBED
pattern is a 2-d array of 32-bit float image intensities. Associated
with each data sample is a host of material attributes or properties
which are, in principle, retrievable via analysis of this CBED stack. Of
note are (1) the crystal space group, covering 200 of the 230 unique
mathematical discrete space groups, and (2) the local electron density,
which governs a material’s properties.
This benchmark consists of 2 tasks: classification of crystal space
groups and reconstruction of local electron density, example
implementations of which are provided in [4] and [5].
STEMDL Specific Benchmark Targets
- Scientific objective(s):
  - Objective: Classification of crystal space groups
  - Formula: F1 score on validation data
  - Score: 0.9 is considered converged
- Data
- Example implementation
5.1 - Setting up Environment from Scratch
A procedure to build an optimized Python from source and set up a development environment to run benchmarks.
A description of how to install nvcc in CUDA.
Requirements
Draft
Introduction
Most modern Linux systems come prepackaged with a version of Python 3.
However, this version is typically deeply integrated into the operating system’s ecosystem of tools, so it may be a significantly older version of Python and may lack some optimizations in order to maximize compatibility.
For benchmarking, it is desirable to have control over your Python build, so that runs are both consistent and repeatable.
Below are the steps to build Python 3.10.2 on a variety of hosts.
Setup
Configurations
This procedure assumes the following:
- You are building using bash.
- You have curl, make, gcc, openssl, bzip2, libffi, zlib, readline, sqlite3, llvm, ncurses, and xz (with their C header files) installed.
- You have set the following environment variables:
  - BASE - specifies the working directory for all operations. This procedure assumes ~/.local.
  - PREFIX - where you want the final Python instance to be placed. This procedure assumes ${BASE}/python/3.10.2.
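A minimal sketch of that setup, assuming the default locations named above; adjust the paths to your site:

```bash
# Assumed locations from the configuration above; change them as needed.
export BASE=~/.local
export PREFIX=${BASE}/python/3.10.2

# Working directories used by the build steps below.
mkdir -p ${BASE}/src ${PREFIX}
```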
Build OpenSSL
# Fetch source code
curl -OL https://www.openssl.org/source/openssl-1.1.1m.tar.gz
tar -zxvf openssl-1.1.1m.tar.gz -C ${BASE}/src/
cd ${BASE}/src/openssl-1.1.1m/
./config --prefix=${BASE}/ssl --openssldir=${BASE}/ssl shared zlib
make
#make test
make install
make clean
Build Python
curl -OL https://www.python.org/ftp/python/3.10.2/Python-3.10.2.tar.xz
tar Jxvf Python-3.10.2.tar.xz -C ${BASE}/src/
cd Python-3.10.2
export CPPFLAGS=" -I${BASE}/ssl/include "
export LDFLAGS=" -L${BASE}/ssl/lib "
export LD_LIBRARY_PATH=${BASE}/ssl/lib:$LD_LIBRARY_PATH
./configure --prefix=${PREFIX} --enable-optimizations --with-lto --with-computed-gotos --with-system-ffi
make -j "$(nproc)"
make test
make altinstall
make clean
mkdir -p ${PREFIX}/bin
(cd ${PREFIX}/bin ; ln -s python3.10 python)
cat <<EOF > ${BASE}/setup.source
#!/bin/bash
BASE=$BASE
PREFIX=$PREFIX
export LD_LIBRARY_PATH=\$BASE/ssl/lib:\$PREFIX/lib:\$LD_LIBRARY_PATH
export PATH=\$PREFIX/bin:\$PATH
EOF
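To use the freshly built interpreter in a new shell, source the generated file (a usage sketch, assuming the paths above):

```bash
# Load the newly built Python into the current shell.
source ${BASE}/setup.source
python -V   # should report Python 3.10.2
```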
Archive Build
To package the build for reuse, create a compressed archive of the working directory:
tar -cJvf python-3.10.2.tar.xz ${BASE}
Common Setup Procedures
To bootstrap your new environment with all the tools frequently leveraged during development, see the below procedures.
Assumption: The variable BASE
is your user home directory, and python3.10 is on the path.
mkdir -p ${BASE}/ENV3
python3.10 -m venv --prompt ENV3 ~/ENV3
source ${BASE}/ENV3/bin/activate
pip install -U pip
pip install cloudmesh-installer
mkdir -p ~/git/cm
(cd ~/git/cm && cloudmesh-installer get cms)
echo "alias ENV3=\"source $BASE/ENV3/bin/activate\"" >> ~/.bash_profile
echo "alias EQ=\"cd $BASE/git\"" >> ~/.bash_profile
source ~/.bash_profile
EQ
git clone git@github.com:laszewsk/mlcommons.git
git clone git@github.com:laszewsk/mlcommons-data-earthquake.git
pip install -r mlcommons/examples/mnist-tensorflow/requirements.txt
pip install -r mlcommons/benchmarks/earthquake/new/requirements.txt
5.2 - Running MLCube on Rivanna
A gentle introduction to running MLCube on Rivanna
In this guide, we introduce MLCube and demonstrate how to run
workloads on Rivanna using the Singularity backend.
Running models consistently across platforms requires users to have
commanding knowledge of the configuration of not only the source code,
but also of the hardware ecosystem. It’s not uncommon that you’ll
encounter a project where configuring your system to get reproducible
results is error prone and time consuming, and ultimately not
productive to the analyst.
MLCube(tm) is a contract-driven approach to address system
configuration details and establishes a standard for generating
consistent models and a mechanism for delivering these models to
others, allowing others to benefit from having a solved environment.
Getting Started
First you need to install a runner for MLCube. MLCube supports many
backend runners and should run equally well on each of them.
For this walkthrough, we target the Rivanna HPC ecosystem, so
we leverage the lmod and singularity ecosystems.
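On Rivanna the singularity runtime itself comes from lmod, so before using the singularity runner you will typically need something like the following; the version string is an example, so check what is currently installed:

```bash
# Load the singularity container runtime provided by Rivanna's module system.
# The exact version may differ; list candidates with "module available singularity".
module load singularity/3.7.1
singularity --version
```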
Python install
We have two
choices to install python. One is with pyenv, the other is with conda.
If you decide to install it with pyenv, use the following steps
pyenv install 3.9.7
pyenv global 3.9.7
python -m venv --prompt mlcube venv
source venv/bin/activate
python -m pip install mlcube-singularity
If you decide to install it with conda, use the following steps
conda create -n mlcube -c conda-forge python=3.9.7
conda activate mlcube
# We use pip as conda does not have an mlcube repository
python -m pip install mlcube-singularity
Note that the mlcube-singularity
package can and should be installed
within your target environment.
Using MLCube
Once you have run the above commands, the mlcube script will be
available on your path, and you can list which runners mlcube has
registered with:
$ mlcube config --get runners
# System settings file path = /home/<username>/mlcube.yaml
# singularity:
# pkg: mlcube_singularity
At this point you can run through any of the example projects that the
mlcube project hosts at
https://github.com/mlcommons/mlcube_examples.git.
Below is a set of procedures to run their hello world project.
git clone https://github.com/mlcommons/mlcube_examples.git
cd ./mlcube_examples/hello_world
mlcube run --mlcube=. --task=hello --platform=singularity
# No output expected.
mlcube run --mlcube=. --task=bye --platform=singularity
# No output expected.
cat ./workspace/chats/chat_with_alice.txt
# You should see some log lines in this file.
Nontrivial example - Earthquake Data
Help wanted
We are looking to convert our earthquake model into an MLCube container.
5.4 - Installing Singularity on Windows Workstations
A procedure to get singularity running on WSL2
Singularity is a container-based runtime engine designed to run in permission-constrained environments.
Singularity provides functions similar to systems like Docker, Containerd, and Podman: it provides an ecosystem to share a computer’s kernel and drivers, and a filesystem based on overlaying files.
These overlays create a type of partitioned software environment that provides isolated execution on the host as a type of “container”.
However, Singularity differs from typical container runtime engines, most notably:
- Singularity was designed to be run as a normal, non-root user and does not depend on a daemon.
- Singularity does not natively support OCI images (the typical container image format target) and uses its own SIF format; but OCI images can be imported.
- Singularity container images are distributed as files.
- Singularity was designed to create a container platform that works from laptops to HPC clusters.
(Windows Only) Setup on Windows Subsystem for Linux
While not the normal place to install Singularity, it is useful to be able to run commands from a local machine to validate command structure and workflows.
Singularity does not run natively on Windows, but with Windows 10 Professional you can build Singularity inside a WSL2 distribution and thereby run the commands on your workstation.
Enabling WSL2
To enable WSL2, follow Microsoft’s instructions.
Any version of Linux will work with Singularity, but we recommend using Ubuntu.
Building Singularity
This process has been automated in ./tools/install-singularity-wsl2.bash if you’re running Ubuntu.
However, the general flow of the instructions is:
- Install the Singularity code dependencies (gcc, libssl, gpgme, squashfs, seccomp, wget, pkg-config, git, and cryptsetup).
- Install a modern version of golang.
- Download the Singularity source code from https://github.com/apptainer/singularity.git.
- Run ./mconfig from the Singularity codebase.
- Run make && make install from the ./builddir directory.
These procedures are more thoroughly covered in the apptainer website at: https://apptainer.org/docs/user/main/quick_start.html#quick-installation-steps
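The following is a rough, non-authoritative sketch of that flow for Ubuntu on WSL2; the package names, golang version, and source tag are examples and may need to be adapted:

```bash
# Build dependencies (Ubuntu package names; other distributions differ).
sudo apt-get update
sudo apt-get install -y build-essential libssl-dev uuid-dev libgpgme-dev \
    squashfs-tools libseccomp-dev wget pkg-config git cryptsetup

# Install a modern golang (version chosen here as an example).
export GO_VERSION=1.17.6
wget https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go${GO_VERSION}.linux-amd64.tar.gz
export PATH=/usr/local/go/bin:${PATH}

# Fetch, configure, and build Singularity from source.
git clone https://github.com/apptainer/singularity.git
cd singularity
./mconfig
cd ./builddir
make
sudo make install
```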
Run your first singularity container
Once the build has completed, you should be able to run the singularity
command.
Try to run
$ singularity run docker://godlovedc/lolcow
If this command was successful you should see something similar to the following:
_____________________________________
/ You recoil from the crude; you tend \
\ naturally toward the exquisite. /
-------------------------------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||
5.5 - Running GPU Batch jobs on Rivanna
A short introduction on how to run GPU Jobs on Rivanna
We explain how to run GPU batch jobs using different GPU cards on
Rivanna. Rivanna is a supercomputer at the University of Virginia, so
this tutorial is only useful if you can get an account on it. Official
documentation is available from UVA Research Computing; however, it has
some issues and does not explain certain important aspects of using
GPUs. Therefore, this guide has been created.
PLEASE HELP US IMPROVE THIS GUIDE
Requirements
We require that you have
- A valid account on Rivanna
- A valid accounting group allowing you to run GPU jobs on Rivanna
Introduction
Rivanna is the High-Performance Computing (HPC) cluster managed by the
University of Virginia’s Research Computing group. Rivanna is composed
of 575 nodes with a total of 20,476 cores and 8PB of different types of
storage. Table 1 shows an overview of the compute nodes. Some of the
compute nodes also include these GPUs:
Table 1: GPUs on Rivanna

| Cores/Node | Memory/Node | Specialty Hardware | GPU memory/Device | GPU devices/Node | # of Nodes |
|------------|-------------|--------------------|-------------------|------------------|------------|
| 40  | 354GB  | -                | -    | -  | 1   |
| 20  | 127GB  | -                | -    | -  | 115 |
| 28  | 255GB  | -                | -    | -  | 25  |
| 40  | 768GB  | -                | -    | -  | 34  |
| 40  | 384GB  | -                | -    | -  | 348 |
| 24  | 550GB  | -                | -    | -  | 4   |
| 16  | 1000GB | -                | -    | -  | 5   |
| 48  | 1500GB | -                | -    | -  | 6   |
| 64  | 180GB  | KNL              | -    | -  | 8   |
| 128 | 1000GB | GPU: A100        | 40GB | 8  | 2   |
| 28  | 255GB  | GPU: K80         | 11GB | 8  | 9   |
| 28  | 255GB  | GPU: P100        | 12GB | 4  | 3   |
| 40  | 383GB  | GPU: RTX 2080 Ti | 11GB | 10 | 2   |
| 28  | 188GB  | GPU: V100        | 16GB | 4  | 1   |
| 40  | 384GB  | GPU: V100        | 32GB | 4  | 12  |

*) This information may be outdated
Access to Rivanna
Access to Rivanna is secured by the University of Virginia’s VPN. UVA
offers two different VPNs. We recommend that you install the UVA
Anywhere VPN, which can be installed on Linux, macOS, and Windows.
After installation, you have to start the VPN. After that, you can use a
terminal to access Rivanna via ssh. If you have not used ssh before, we
encourage you to read about it and explore commands such as ssh,
ssh-keygen, ssh-copy-id, ssh-agent, and ssh-add.
Note: gitbash on Windows
Please note that on Windows, you are expected to install gitbash so
you can use the same commands and ssh logic as on Linux and Mac. For
this reason, we do not recommend putty, PowerShell, or cmd.exe. Using
gitbash lets us script the same way, even on Windows, and significantly
simplifies this guide.
We will not provide an extensive tutorial on how to use ssh
(contributions are welcome). Instead, we summarize the most important steps:
- Create an ssh key if you have not done so before.
  It is VERY important that you create the key with a strong passphrase.
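A minimal sketch, assuming the default rsa key file that the ssh config below refers to:

```bash
# Generate a key pair at ~/.ssh/id_rsa; choose a strong passphrase when prompted.
ssh-keygen -t rsa -f ~/.ssh/id_rsa
```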
- Add an abbreviation for Rivanna to your ~/.ssh/config file.
  Use your favorite editor (mine is emacs):
emacs ~/.ssh/config
  Copy and paste the following into that file, where abc1de is to be
  substituted by your UVA compute id:
Host rivanna
    User abc1de
    HostName rivanna.hpc.virginia.edu
    IdentityFile ~/.ssh/id_rsa
  This will allow you to use rivanna instead of
  abc1de@rivanna.hpc.virginia.edu.
  The next steps assume you have done this and can use just rivanna.
- Copy your public key to rivanna.
  This will copy your public key into the rivanna:~/.ssh/authorized_keys file.
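A sketch using ssh-copy-id (you must be on the VPN and will be prompted for your UVA password):

```bash
# Append the public key to rivanna:~/.ssh/authorized_keys.
ssh-copy-id -i ~/.ssh/id_rsa.pub rivanna
```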
- After this step, you can use your keys to authenticate. You still
  need to be using the VPN, though.
  The most convenient systems for this are Mac and Ubuntu, which already
  ship with ssh-agent and keychain. Under gitbash on Windows you need to
  start the agent yourself. First, add the key to your session with
  ssh-add, so you do not have to constantly type in the passphrase; then
  test that it works by asking Rivanna for its hostname. If your machine
  does not already run ssh-agent, start it before you type the ssh-add
  command. If everything is set up correctly, the test returns the
  hostname of the Rivanna login node, as shown in the sketch below.
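A minimal sketch of these steps; the commands are standard OpenSSH, and the exact hostname printed by the test depends on the login node you land on:

```bash
# Start an ssh agent if one is not already running (needed under gitbash).
eval $(ssh-agent)

# Add your key to the agent; you will be asked for the passphrase once.
ssh-add ~/.ssh/id_rsa

# Test the setup: this should print the hostname of a Rivanna login node.
ssh rivanna hostname
```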
- To log in to Rivanna, simply say
ssh rivanna
  If this does not work, you have made a mistake. Please review the
  previous steps carefully.
Running Jobs on Rivanna
Jobs on Rivanna can be scheduled through SLURM either as batch jobs or
as interactive jobs. To achieve this, one needs to load the software
first and create special scripts that are used to submit the jobs to
nodes that contain the GPUs you specify.
The user documentation about this is provided here; however, at the time
when we looked at it, it had some mistakes and limitations that we hope
to overcome in this guide.
Modules
Rivanna’s default mechanism of software configuration management is
done via modules. The UVA modules documentation is provided through this
link.
Modules provide the ability to load a particular software stack and
configuration into your shell, but also into your batch jobs. You can
load multiple modules into your environment; they are loaded in order.
To list the available modules, log into Rivanna and use the module
listing command shown below. The listing command can also take a
keyword; for example, passing py returns all modules that have py in
their name, so please choose those that look like Python modules.
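A sketch of both commands, using the same available subcommand that appears later in this guide:

```bash
# List all available modules.
$ module available

# List only the modules whose names contain "py".
$ module available py
```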
To probe for deep learning modules, use something similar to
$ module available cuda tensorflow pytorch mxnet nvidia cudnn
Python
Different versions of python are available.
To load python 3.8 we can say
$ module load anaconda/2020.11-py3.8
To load Python 3.10.0 we can say
$ module load anaconda
$ conda create -n py3.10 python=3.10
$ source activate py3.10
$ python -V
Python 3.10.0
Please note that at the time of writing anaconda did not provide Python
3.10.2; the version I run personally on my computer comes from python.org.
Adding Modules with Spider
Details about modules can be obtained with the module spider command.
If you type it in without arguments, you get a list of many available
configurations. Spider can also take a keyword and list all available
versions the keyword matches.
Let us demonstrate it on python:
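The command that produced the listing below (the exact versions shown will change as Rivanna's software stack is updated) is:

```bash
$ module spider python
```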
----------------------------------------------------------------------------
python:
----------------------------------------------------------------------------
Description:
Python is a programming language that lets you work more effectively.
Versions:
python/2.7.16
python/3.6.6
python/3.6.8
python/3.7.7
python/3.8.8
Other possible modules matches:
biopython openslide-python wxpython
----------------------------------------------------------------------------
...
For detailed information about a specific “python” package use the module’s full name.
$ module spider python/3.8.8
This will return a page with lots of information. The most important one for us is
You will need to load all module(s) on any one of the lines below before the
"python/3.8.8" module is available to load.
gcc/11.2.0 openmpi/3.1.6
gcc/9.2.0 cuda/11.0.228 openmpi/3.1.6
gcc/9.2.0 mvapich2/2.3.3
gcc/9.2.0 openmpi/3.1.6
gcccuda/9.2.0_11.0.228 openmpi/3.1.6
goolfc/9.2.0_3.1.6_11.0.228
Here you see various combinations of modules that need to be loaded BEFORE you load python.
Thus, to properly load python 3.8.8 you need to say (if this is the combination you chose):
module load gcc/11.2.0
module load openmpi/3.1.6
module load python/3.8.8
Modules for tensorflow
module load singularity/3.7.1
module load tensorflow/2.7.0
Modules for pytorch
module load singularity/3.7.1
module load pytorch/1.10.0
Containers
Rivanna uses singularity as its container technology. The documentation
specific to singularity on Rivanna is available at this link.
Singularity also needs to be loaded as a module before it can be used.
Singularity containers have the ability to access GPUs via a passthrough
using NVIDIA drivers. Once you load singularity, you can use it as
follows:
singularity <cmd> --nv <imagefile> <args>
The container will be used inside a job.
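For example, a hypothetical invocation that runs a script inside a TensorFlow image with GPU passthrough might look like this (the image file and script names are illustrative placeholders, not files provided by this guide):

```bash
# --nv passes the host NVIDIA driver and GPU devices into the container.
# tensorflow-2.7.0.sif and mnist.py are placeholders for illustration.
singularity exec --nv tensorflow-2.7.0.sif python mnist.py
```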
Jobs
More detail specific to jobs on Rivanna is provided here.
Before we start an example, we explain how to create a job: first we
write a job description file and then submit it to Rivanna. We use a
simple MNIST example that showcases the aspects of successfully running
a job on the machine. We will therefore focus on creating jobs using GPUs.
New 8 A100 GPUs to be added
Rivanna will have eight additional nodes available to us, but they are not yet in service.
Instead, we will be using the two existing nodes, which are shared with other users.
Rivanna uses the SLURM job scheduler for allocating submitted jobs.
Jobs are charged SUs (service units) against a compute allocation.
Please contact your supervisor for the name of your allocation. Gregor’s
allocation currently contains 100k SUs. Students from the UVA capstone
class will have their own allocation.
To see the available SUs for your project, please use the command
allocations
allocations -a <allocation_name>
SUs can be requested via the Standard Allocation Renewal form. Due to
this limitation, we encourage you to plan your runs and try to avoid
unnecessary ones. General instructions for submitting SLURM jobs are
provided in the UVA Research Computing documentation.
To request that the job be submitted to the GPU partition, you use the option
-p gpu
The A100 GPUs are a requestable resource. To request them, you would add
the gres option with the number of A100 GPUs requested (1 through 8).
For example, to request 2 A100 GPUs:
--gres=gpu:a100:2
If you are using a SLURM script to submit the job, the options would
appear as follows. Your script will need to specify other options, such
as the allocation to charge, as seen in the sample scripts in the
documentation referenced above:
#SBATCH -p gpu
#SBATCH --gres=gpu:a100:2
#SBATCH -A bii_dsc
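Putting these pieces together, a minimal batch script might look like the following sketch. The job name, time limit, module versions, container image, and training script are illustrative assumptions, and the allocation passed to -A must be your own:

```bash
#!/bin/bash
#SBATCH --job-name=mnist-a100        # illustrative job name
#SBATCH -p gpu                       # submit to the GPU partition
#SBATCH --gres=gpu:a100:2            # request 2 A100 GPUs
#SBATCH -A bii_dsc                   # replace with your allocation
#SBATCH --time=01:00:00              # wall time (example value)
#SBATCH --output=%x-%j.out           # stdout/stderr in <name>-<jobid>.out

# Load the software stack (versions are examples; see the Modules section).
module load singularity/3.7.1
module load tensorflow/2.7.0

# Run the training script inside a TensorFlow container.
# The image path and script name below are assumptions; check the Rivanna
# TensorFlow documentation for the actual container location.
singularity exec --nv tensorflow-2.7.0.sif python mnist.py
```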
Interactive Jobs
Please avoid running interactive jobs, as they may waste SUs; we are
charged while you keep the A100 idle.
Although Research Computing also offers interactive apps such as
JupyterLab, RStudio, CodeServer, Blender, and Mathematica via the Open
OnDemand portal, we ask you to avoid using them for benchmarks.
To request the use of the A100s via Open OnDemand, first log in to the
Open OnDemand portal and select the desired interactive app. You will be
presented with a form to complete. Currently, you would
- select gpu for the Rivanna partition,
- select NVIDIA A100 from the Optional: GPU type for GPU partition pulldown menu, and
- enter the number of desired GPUs under Optional: Number of GPUs.
Once you’ve completed the form, click the Launch button and your session
will be launched. The session will start once the resources are available.
Using the MNIST example
For now, the code is located at:
A sample slurm job specification is included at
To run it use the command
$ sbatch mnist-rivanna-a100.slurm
NOTE: We want to improve the script to make sure it is running on a
GPU and add GPU placement commands into the code.
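After submitting, you can monitor the job with standard SLURM commands (a minimal sketch; replace the job id with the one printed by sbatch):

```bash
# Show your queued and running jobs.
squeue -u $USER

# Inspect accounting information for a completed job.
sacct -j <jobid>
```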
Custom Version of TensorFlow
https://www.rc.virginia.edu/userinfo/rivanna/software/tensorflow/
Keras on Rivanna
Building a Python version from Source
Requirements
This section is under development.
Why do you want to do this?
How has it been done?
We have developed the following script to create the environment on Rivanna:
You can download the script from git with wget and place it in a directory. Running it with
$ python-install.py --version="3.10.2" --host=rivanna
will create an optimized version for Rivanna. Other options can be found with
python-install.py help
Where do you want to place it
scratch vs home dir
How do you access it?
deployment into your own environment
benchmarks vs the various versions of python here. This needs to be reproducible when we have a new version of python.
How to cite if you use this
This work was conducted as part of the MLCommons science benchmark earthquake project, and if you would like to reuse it, we would like you to cite the following paper:
@TechReport{mlcommons-earthquake,
author = {Thomas Butler and Robert Knuuti and
Jake Kolessar and Geoffrey C. Fox and
Gregor von Laszewski and Judy Fox},
title = {MLCommons Earthquake Science Benchmark},
institution = {MLCommons Science Working Group},
year = 2022,
type = {Report by University of Virginia},
address = {Charlottesville, VA},
month = may,
note = {The order of the authors and url location may change},
annote = {Version: draft},
url = {https://github.com/cyberaide/paper-capstone-mlcommons}
}