Data download from the grid#
Preparation#
The Grid is a cooperation of many different clusters and research organizations, and as such, there is no centralized user management. Yet, there must be a way for the system to identify you and your work. This is why Grid certificates and Virtual Organizations (VOs) are introduced.
Your digital identity starts with a private key. Only you are allowed to know the contents of this key. Next, you need a Grid certificate, which is issued by a Certificate Authority (CA). The Grid certificate contains your name and your organization, and it says that the person who owns the private key is really the person mentioned, and that this is certified by the Certificate Authority.
Now this is your identity. Big international collaborations do not want to deal with every user individually. Instead, users become part of Virtual Organizations. To give an analogy, the Grid certificate provides authentication (identity, e.g., like a passport) and the VO provides authorization (approval, e.g., like a visa).
To access the Grid, you have to complete two essential steps:
- Get a Grid certificate, so that you can be identified on the Grid.
- Join the AGATA Virtual Organization (VO), so that you can access the Grid.
You then need a User Interface (UI) that provides the proper environment to interact with the Grid. This user interface can either be installed by your IT services or obtained from the AGATA Grid docker image.
This manual explains how to use the docker image and how to download data from the grid using the Python script of the AGATA collaboration.
It is inspired by the original AGATA Grid user guide.
Get a Grid certificate#
Grid certificates are supplied by a Certificate Authority (CA). The procedure for acquiring a grid certificate is different at each university, laboratory, etc. If there are active grid users or specialists at your home institute, ask them for advice on how to get a grid certificate. Note that AGATA does NOT issue any grid certificates.
Once you have your grid certificate, first load it into your browser; this will allow you to request to join the AGATA Virtual Organisation (VO). You can then export your certificate file from your browser in .p12 format.
- Firefox:
  - Edit/Preferences
  - Privacy & security / Certificates / View certificates
  - In the Your certificates tab: select the certificate and click on Export. Give the file a name and save it as .p12 (you will be asked for a password to protect it).
- Chrome:
  - Go to the local URL chrome://settings/certificates
  - In the Your certificates tab: select the certificate and click on Export. Give the file a name and save it as .p12 (you will be asked for a password to protect it).
This certificate then needs to be converted into a format used by the grid UI. This is done by a script in the docker image. See the original documentation for more details.
Join the AGATA Virtual Organization#
A Virtual Organization (VO) is a group of geographically distributed people that have common objectives and that are using shared Grid resources to achieve them. Every Grid user is a member of one or more VOs.
In practice, your VO membership determines which resources (compute and storage) you have access to. You are eligible to register for a VO only once you have a valid certificate. To join the AGATA VO, you must have the grid certificate successfully installed in your browser.
The membership is managed by an INDIGO IAM server. A few steps need to be carried out on this server to become part of the AGATA VO:
Single Sign-On (SSO) authentication#
The first connection needs to be made through an identity provider. On our server, both EDUGAIN and ORCID are available. For the first connection to the INDIGO IAM server, please use one of them to register and fill in the registration form.
Once validated, you will be able to access the IAM dashboard, as below:

Link certificate#
The next step is to link your grid certificate, using the "Link certificate" button. Once this is done, you will be able to connect to the IAM server either using the SSO or with your certificate. Your certificate will now appear on the dashboard:

Join AGATA VO group#
Finally, you need to register to the AGATA VO group, using the "Join a group" button and selecting the group "vo.agata.org". Once your account has been validated by the admins, you will be able to see that you are associated with this group, as below:

For any further information or help about the AGATA VO, please contact us (agatadp@ip2i.in2p3.fr).
Some basic information on the data management on the grid#
The dCache storage consists of magnetic tape storage and hard disk storage. Before it can be used, data stored on magnetic tape has to be copied to a hard drive. This action is called staging a file, or bringing a file online. When a file's status is NEARLINE, the file is on tape only. Conversely, the status ONLINE means that the file is on disk only. The status ONLINE_AND_NEARLINE, which appears in the listings later in this guide, means that the file is available both on disk and on tape.
Before downloading data, a bring_online operation is necessary to copy all the files from the tapes to the disks. The disk pool where your files are staged has limited capacity and is only meant for data that a user wants to process. When you stage a file, you set a pin lifetime. The pin lifetime here is set to 1 week, but note that this counts from the moment you submit the request, independently of the actual time that the files spend on disk. The file will not be purged until the pin lifetime has expired. After that, the data may be purged from disk as soon as the space is required for new stage requests. When the disk copy has been purged, it has to be staged again in order to be processed on a Worker Node.
When a pool group is full with pinned files, staging is paused. Stage requests will just wait until pin lifetimes for other files expire. dCache will then use the released space to stage more files until the pool group is full again. When you are done with your processing, we recommend you release (or unpin) all the files that you don’t need any more. It is an optional action, but helps a lot with the effective system usage.
The AGATA collaboration script automatically unpins files after they have been downloaded. However, if a list of files has been brought online but not downloaded, please use the release command to unpin them.
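As an illustration of the rules above, the following Python sketch models the file localities and the pin lifetime. It is not part of the AGATA script: the names Locality, needs_staging and pin_expiry are hypothetical, and ONLINE_AND_NEARLINE is assumed to mean that a copy exists on both disk and tape.

```python
from datetime import datetime, timedelta
from enum import Enum


class Locality(Enum):
    """File localities as they appear in the status listings of this guide."""
    NEARLINE = "NEARLINE"                        # copy on tape only
    ONLINE = "ONLINE"                            # copy on disk only
    ONLINE_AND_NEARLINE = "ONLINE_AND_NEARLINE"  # assumed: copy on disk and on tape


def needs_staging(locality: Locality) -> bool:
    """A file has to be staged (brought online) when it has no disk copy."""
    return locality is Locality.NEARLINE


def pin_expiry(request_time: datetime,
               pin_lifetime: timedelta = timedelta(weeks=1)) -> datetime:
    """The pin lifetime counts from the submission of the stage request,
    not from the moment the file actually reaches the disk."""
    return request_time + pin_lifetime


if __name__ == "__main__":
    submitted = datetime(2024, 6, 21, 12, 0)
    print("staging needed:", needs_staging(Locality.NEARLINE))
    print("pin may expire from:", pin_expiry(submitted))
```

The point of pin_expiry is that a file staged late in the week stays pinned for less than a week on disk, so a long queue of stage requests shortens the effective window for downloading.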
AGATA Grid docker image#
Download docker image#
The AGATA collaboration shares a docker image with the Grid UI installed. This docker image generation is done here
For the following, the docker application needs to be installed.
To install the AGATA Grid docker image:
Start docker image#
You then need to define two environment variables:
- CERTIF_DIR: directory on your computer containing your certificate in .p12 format. This is only required once, to produce the certificates in the .pem format required by the grid UI. This folder will be mounted as the /root folder of the docker image.
- DATA_DIR: directory where the data will be downloaded
Assuming CERTIF_DIR is /path/to/my/certificates and DATA_DIR is /path/to/data, apply:
Then, the docker image is started using:
docker run -it --rm -v ${CERTIF_DIR}:/root -v ${DATA_DIR}:/data gitlab-registry.in2p3.fr/ip2igamma/docker_images:agata_grid_IAM
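For reference, the mapping between the two environment variables and the container mount points can be sketched in Python as below. The helper name docker_run_command is hypothetical; it only assembles the same command line as above and does not run it.

```python
import shlex
from typing import List

# Image name taken from the docker run command of this guide.
IMAGE = "gitlab-registry.in2p3.fr/ip2igamma/docker_images:agata_grid_IAM"


def docker_run_command(certif_dir: str, data_dir: str) -> List[str]:
    """Build the docker run argument list (hypothetical helper).

    certif_dir is mounted as /root (where the .p12 certificate is looked up),
    data_dir is mounted as /data (the default download folder).
    """
    return [
        "docker", "run", "-it", "--rm",
        "-v", f"{certif_dir}:/root",
        "-v", f"{data_dir}:/data",
        IMAGE,
    ]


if __name__ == "__main__":
    command = docker_run_command("/path/to/my/certificates", "/path/to/data")
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(command))
```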
The docker container should start in the /opt/AgataGrid folder, where the following files must be present:
root@:/opt/AgataGrid$ ls -l
total 40
-rwxr-xr-x 1 root root 29755 Jun 21 12:12 GridDataSync.py
-rwxr-xr-x 1 root root 929 Jun 21 12:12 do_grid_completion.sh
-rwxrwxrwx 1 root root 1274 Jun 21 12:12 gen_cert.sh
The AGATA Grid python script is GridDataSync.py. do_grid_completion.sh is the auto-completion file of the script, which proposes the script options when pressing TAB.
The gen_cert.sh script is used to generate the certificate in the .pem format required by the grid UI.
Generate certificate files in grid format#
This step is only required once. It will create, in the same folder as your grid .p12 certificate, two .pem certificate files that are required by the grid UI.
To generate these files, launch the gen_cert.sh script. This will produce the files in a folder named .globus. If this folder already exists, you need to remove it first.
The script requires you to provide the name of the .p12 certificate, which should be located in the /root folder of your docker image, according to your CERTIF_DIR environment variable.
The script will ask for your certificate password twice (once for each .pem file to be produced), and then it will ask you to define (and confirm) the password you want to apply to the .pem certificate file.
root@:/opt/AgataGrid$ ./gen_cert.sh /root/DUDOUET_GRID_2023.p12
Enter Import Password:
MAC verified OK
Enter Import Password:
MAC verified OK
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
The files usercert.pem and userkey.pem have been created and moved to /root/.globus
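The content of gen_cert.sh is not reproduced in this guide. A conventional way to perform such a conversion is two openssl pkcs12 calls, which would be consistent with the two import password prompts and the single PEM pass phrase prompt shown above. The sketch below only builds these command lines; it is an assumption about the approach, not a copy of the script, and the function name conversion_commands is hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def conversion_commands(p12_file: str,
                        globus_dir: str = "/root/.globus") -> List[List[str]]:
    """Return the two openssl command lines of a typical .p12 to .pem conversion.

    Assumption: this mirrors what a script such as gen_cert.sh usually does;
    the real script may differ.
    """
    out = PurePosixPath(globus_dir)
    return [
        # Extract the user certificate only (no private key).
        ["openssl", "pkcs12", "-in", p12_file,
         "-clcerts", "-nokeys", "-out", str(out / "usercert.pem")],
        # Extract the private key only; openssl asks for a PEM pass phrase.
        ["openssl", "pkcs12", "-in", p12_file,
         "-nocerts", "-out", str(out / "userkey.pem")],
    ]


if __name__ == "__main__":
    for command in conversion_commands("/root/my_certificate.p12"):
        print(" ".join(command))
```

The private key file should only be readable by its owner; grid tools commonly refuse a userkey.pem with wider permissions.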
To test that everything is working as it should, you can try to create a new grid proxy:
root@:/opt/AgataGrid$ GridDataSync.py --new_proxy
Enter GRID pass phrase for this identity:
Contacting cclcgvomsli01.in2p3.fr:15007 [/O=GRID-FR/C=FR/O=CNRS/OU=CC-IN2P3/CN=cclcgvomsli01.in2p3.fr] "vo.agata.org"...
Remote VOMS server contacted succesfully.
Created proxy in /tmp/x509up_u0.
Your proxy is valid until Mon Jun 24 13:25:43 UTC 2024
Script user guide#
Presentation of the different options of the script#
To show the available options on the terminal, use the --help option:
root@:/opt/AgataGrid$ GridDataSync.py --help
Usage: GridDataSync.py [options]
Browse and download AGATA data from the grid
Options:
-h, --help show this help message and exit
--new_proxy create a new proxy
--proxy_status print the proxy status
--from_LYON download data from CC Lyon (default)
--from_CNAF download data from Bologna
--show_conf show the current configuration (paths, patterns)
--ls_dir list the content of the given folder
--input_dir=path copy grid data from distant path
--output_dir=path copy grid data into local path
--exc=patt exclude patterns, separated by ":", will skip all files containing exc patterns (none to reset)
--inc=patt include patterns, separated by ":", will skip all files not containing inc patterns (none to reset)
(check https://regexone.com/references/python for python regexp format)
--build_list build the list of files to be downloaded (mandatory before start)
--bring_online move files from tape to disks (make the copy of files faster)
--check_status check the status of the files to be downloaded (locality, downloaded...)
--verbose increase the verbosity
--start launches the download of the files from the grid
--force force the download of offline files (much slower)
--nochecksum remove the checksum on each downloaded file
--overwrite Overwrite the already downloaded files
--release release all files from disk
In the following, the different options will be presented:
help:#
Prints the above message. It is also printed if no option is given.
new_proxy:#
This command will create a new proxy, valid for 72 hours (this requires your certificate password).
proxy_status:#
This command will show the status of your proxy, including the remaining time to use it:
root@:/opt/AgataGrid$ GridDataSync.py --proxy_status
subject : /O=GRID-FR/C=FR/O=CNRS/OU=IPNL/CN=Jeremie Dudouet/CN=137159801
issuer : /O=GRID-FR/C=FR/O=CNRS/OU=IPNL/CN=Jeremie Dudouet
identity : /O=GRID-FR/C=FR/O=CNRS/OU=IPNL/CN=Jeremie Dudouet
type : RFC3820 compliant impersonation proxy
strength : 2048
path : /tmp/x509up_u0
timeleft : 71:54:08
key usage : Digital Signature, Non Repudiation, Key Encipherment, Data Encipherment, Key Agreement
=== VO vo.agata.org extension information ===
VO : vo.agata.org
subject : /O=GRID-FR/C=FR/O=CNRS/OU=IPNL/CN=Jeremie Dudouet
issuer : /DC=org/DC=terena/DC=tcs/C=FR/ST=Paris/O=Centre national de la recherche scientifique/CN=cclcgvomsli01.in2p3.fr
attribute : /vo.agata.org/Role=NULL/Capability=NULL
timeleft : 71:54:07
uri : cclcgvomsli01.in2p3.fr:15007
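Before starting a long staging or download, it can be useful to check the remaining proxy lifetime programmatically. The sketch below assumes the "timeleft : HH:MM:SS" layout shown above; parse_timeleft is a hypothetical helper, not an option of GridDataSync.py, and it returns the first occurrence, which in this output is the proxy itself rather than the VO extension.

```python
import re
from datetime import timedelta


def parse_timeleft(status_text: str) -> timedelta:
    """Extract the first 'timeleft : HH:MM:SS' value from a proxy status text."""
    match = re.search(r"^\s*timeleft\s*:\s*(\d+):(\d{2}):(\d{2})",
                      status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no timeleft line found in the proxy status")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    sample = "path : /tmp/x509up_u0\ntimeleft : 71:54:08\n"
    remaining = parse_timeleft(sample)
    # Warn when less than one day of validity remains.
    if remaining < timedelta(days=1):
        print("proxy expires soon, consider --new_proxy")
    else:
        print("proxy still valid for", remaining)
```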
from_LYON:#
To download the data from the Lyon Tier1 site (this is the default value)
from_CNAF:#
To download the data from the Bologna Tier1 site
show_conf:#
To store the different parameters, a configuration file is automatically created at the first execution of the program and updated whenever the parameters change.
To see the current configuration, use this command, or check the file Grid.conf:
root@:/opt/AgataGrid$ GridDataSync.py --show_conf
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
OUTPUTDIR : /data/
ls_dir:#
This shows the directory tree of a folder on the grid. Different cases are possible:
- without argument: this prints the content of the base directory of the server to which you are connected:
root@:/opt/AgataGrid$ GridDataSync.py --ls_dir
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
-- List content of folder: srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=/pnfs/in2p3.fr/data/agata/
Folder 02 Sep 2010 lschwarz
Folder 08 Sep 2010 pietro_test
Folder 21 Sep 2010 2010_week19
Folder 22 Oct 2010 Milano
Folder 25 Oct 2010 2010_week28
Folder 24 Mar 2011 2010_week48
80.6 MB 31 Mar 2011 test_small_file_johan.tar
Folder 01 Apr 2011 252Cf
Folder 22 Feb 2013 generated
Folder 26 Apr 2013 kaci-test
Folder 07 Oct 2013 ycalas
Folder 12 Oct 2013 121113_boutachkov_pietri
158.0 B 15 Oct 2013 host_adler32
Folder 28 Nov 2013 121001_rudolph
...
- by giving as argument a sub folder:
root@:/opt/AgataGrid$ GridDataSync.py --ls_dir e680/e680
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
-- List content of folder: srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=/pnfs/in2p3.fr/data/agata/e680/e680
Folder 09 Jun 2015 NarvalAGATAsolo
Folder 09 Jun 2015 Replay
Folder 09 Jun 2015 ReplayFromAnalysisServer
Folder 09 Jun 2015 run_0001.dat.07-05-15_14h17m05s
Folder 09 Jun 2015 run_0002.dat.07-05-15_14h22m27s
Folder 09 Jun 2015 run_0003.dat.11-05-15_09h55m42s
Folder 09 Jun 2015 run_0004.dat.11-05-15_10h06m11s
Folder 09 Jun 2015 run_0005.dat.11-05-15_11h05m02s
Folder 09 Jun 2015 run_0006.dat.11-05-15_11h58m20s
Folder 09 Jun 2015 run_0007.dat.11-05-15_12h16m13s
Folder 09 Jun 2015 run_0008.dat.11-05-15_12h26m07s
Folder 09 Jun 2015 run_0009.dat.11-05-15_12h34m52s
Folder 09 Jun 2015 run_0010.dat.11-05-15_13h04m12s
Folder 09 Jun 2015 run_0011.dat.11-05-15_13h55m12s
Folder 09 Jun 2015 run_0012.dat.11-05-15_14h20m10s
Folder 09 Jun 2015 run_0013.dat.11-05-15_14h41m16s
...
The --ls_dir command helps you navigate the grid directory tree to find the folder you want to download.
input_dir:#
Once you have found the folder you want to download, you will set it as input_dir with this option.
exc:#
This option defines exclude patterns to limit the type of data you want to download. The exclude patterns use the Python regular expression syntax. You can find more information at this link: https://regexone.com/references/python.
When you define exclude patterns, all file names that contain this pattern (or these patterns) are excluded when building the list of files (see below). If more than one pattern is given, they need to be separated by ":".
Examples:
This command will exclude all .tar and cdat files. To remove the exclude patterns, use the --exc option with the none argument.
inc:#
Similarly, the --inc option specifies include patterns, i.e. patterns that must be present in the file name for the file to be added to the list of files. The same procedure applies as for the --exc option.
Examples:
This command will only save files for the crystals starting with 03 (03A, 03B, 03C). Here also, to remove the include patterns, use the --inc option with the none argument.
Include and exclude patterns are shown by the --show_conf option.
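The effect of the include and exclude patterns can be sketched in Python as follows. This is not the code of GridDataSync.py: the function names are hypothetical, patterns are applied with re.search, and a file is kept when it matches at least one include pattern and no exclude pattern. How the real script combines several include patterns is not stated in this guide, so the "at least one" rule is an assumption.

```python
import re
from typing import Iterable, List


def split_patterns(spec: str) -> List[str]:
    """Split a ':'-separated pattern string; 'none' (or empty) means no pattern."""
    if spec in ("", "none"):
        return []
    return spec.split(":")


def select_files(paths: Iterable[str], inc: str = "none", exc: str = "none") -> List[str]:
    """Keep the paths that match an include pattern and no exclude pattern."""
    include = [re.compile(pattern) for pattern in split_patterns(inc)]
    exclude = [re.compile(pattern) for pattern in split_patterns(exc)]
    selected = []
    for path in paths:
        if any(regex.search(path) for regex in exclude):
            continue  # contains an exclude pattern
        if include and not any(regex.search(path) for regex in include):
            continue  # include patterns are set but none is present
        selected.append(path)
    return selected


if __name__ == "__main__":
    # Illustrative paths, shaped like the listings of this guide.
    paths = [
        "run/Conf/03A/SRM_AGATA_small_files.tar",
        "run/Data/03A/SRM_AGATA_event_mezzdata.cdat.0000",
        "run/Data/04B/SRM_AGATA_small_files.tar",
        "run/Out/03A/psa.adf",
    ]
    for path in select_files(paths, inc=".*/03./.*", exc=".*.adf"):
        print(path)
```

Note that in a regular expression an unescaped "." matches any character, so a pattern such as .*.adf matches any name containing "adf" preceded by at least one character; writing \.adf restricts it to the literal extension.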
build_list:#
Once the input_dir has been set, it is necessary to build the list of all the files in this folder. This is done using this option. This operation builds a catalog, grid_summary.csv. This file allows you to follow the status of the files (URLs, tape or disk locality, download status).
bring_online:#
This mandatory operation copies the files from tapes to disks. The --bring_online command launches a background task on the grid server to copy all your files to disk. You can wait during this process to follow its progress, or close it and let the staging continue in the background. For a large number of files, the staging needs to be done in batches of 1000 files at 1 minute intervals, to prevent overloading the staging namespace server.
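The batching described above can be sketched as follows. This is not the code of GridDataSync.py: submit stands for whatever call sends a stage request (hypothetical), and the sleep function is a parameter so that the logic can be tried without actually waiting.

```python
import time
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def stage_in_batches(files: Sequence[str],
                     submit: Callable[[List[str]], None],
                     size: int = 1000,
                     pause_s: float = 60.0,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Submit stage requests batch by batch, pausing between two batches.

    Returns the number of batches submitted.
    """
    count = 0
    for index, batch in enumerate(batches(files, size)):
        if index > 0:
            sleep(pause_s)  # spread the load on the staging namespace server
        submit(batch)
        count += 1
    return count


if __name__ == "__main__":
    files = [f"file_{i:04d}" for i in range(2500)]
    # Dry run: print the batch sizes instead of staging, and do not wait.
    stage_in_batches(files, submit=lambda b: print(len(b), "files"), sleep=lambda s: None)
```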
check_status:#
This command checks the status of the process. It prints the locality of the files and the download status. The catalog is updated to take into account the progress of the bring_online operation.
For more information, the --verbose option can be added to this command.
Examples:
... updating the catalog
-- 117 Files from e806/e806/run_0204.dat.05-07-21_00h12m13s in the list
==> 3 downloaded (114 remaining)
==> 114 brought online for non downloaded files (0 remaining)
Here we can see that all the files have been copied to disk, and 3 have already been downloaded.
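When the check is run repeatedly, for instance from a wrapper script, the summary lines can be turned into numbers. The sketch below assumes the exact wording of the three summary lines shown above; parse_status_summary is a hypothetical helper and would need adapting if the script output changes (for example, the final "All the files have been downloaded" message has a different form and is not handled).

```python
import re
from typing import Dict

# Regular expressions matching the three summary lines of --check_status.
_SUMMARY_PATTERNS = {
    "total": r"--\s*(\d+) Files from \S+ in the list",
    "downloaded": r"==>\s*(\d+) downloaded \(\d+ remaining\)",
    "staged": r"==>\s*(\d+) brought online for non downloaded files \(\d+ remaining\)",
}


def parse_status_summary(text: str) -> Dict[str, int]:
    """Return the counts found in the summary lines of a status output."""
    counts = {}
    for key, pattern in _SUMMARY_PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"summary line for '{key}' not found")
        counts[key] = int(match.group(1))
    return counts


if __name__ == "__main__":
    sample = (
        "-- 117 Files from e806/e806/run_0204.dat.05-07-21_00h12m13s in the list\n"
        "==> 3 downloaded (114 remaining)\n"
        "==> 114 brought online for non downloaded files (0 remaining)\n"
    )
    counts = parse_status_summary(sample)
    # Files not yet downloaded and not yet on disk still wait for staging.
    waiting = counts["total"] - counts["downloaded"] - counts["staged"]
    print(counts, "waiting for staging:", waiting)
```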
start:#
This command launches the download. By default, a checksum is computed after each downloaded file. To skip this operation, use the --nochecksum option.
To overwrite files that have already been downloaded, use the --overwrite option.
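This guide does not state which checksum algorithm the script uses. As an illustration only, the sketch below computes an Adler-32 checksum chunk by chunk, an algorithm commonly used on dCache storage; the choice of Adler-32 and all function names are assumptions, not a description of GridDataSync.py.

```python
import zlib
from typing import Iterable, Iterator


def adler32_hex(chunks: Iterable[bytes]) -> str:
    """Adler-32 of a stream of byte chunks, as 8 lowercase hex digits."""
    value = 1  # Adler-32 starting value
    for chunk in chunks:
        value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"


def file_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Read a file piece by piece so that large data files fit in memory."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def verify_download(path: str, expected_hex: str) -> bool:
    """Compare the checksum of a local file with the expected value."""
    return adler32_hex(file_chunks(path)) == expected_hex.lower()


if __name__ == "__main__":
    print(adler32_hex([b"Wiki", b"pedia"]))
```

Reading in chunks keeps the memory use constant, which matters for data files close to 1 GB such as those listed in this guide.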
release:#
This command releases all the files that have been pinned using the --bring_online option, for example when the files will not be downloaded after all.
Example: download a folder on a local computer#
In this example, we will download the configuration files and the traces (cdat files) for crystals 03ABC in the run 003 of the e680 experiment from the Lyon grid server:
Create a new proxy if necessary:#
Navigate the grid directory tree to the right folder:#
root@:/opt/AgataGrid$ GridDataSync.py --ls_dir e680
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
OUTPUTDIR : /data/
-- List content of folder: srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=/pnfs/in2p3.fr/data/agata/e680
Folder 09 Jun 2015 e680
Folder 09 Jun 2015 ReadATCA
Folder 15 Jun 2015 e680_NoTraces
root@:/opt/AgataGrid$ GridDataSync.py --ls_dir e680/e680
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
OUTPUTDIR : /data/
-- List content of folder: srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=/pnfs/in2p3.fr/data/agata/e680/e680
Folder 09 Jun 2015 NarvalAGATAsolo
Folder 09 Jun 2015 Replay
Folder 09 Jun 2015 ReplayFromAnalysisServer
Folder 09 Jun 2015 run_0001.dat.07-05-15_14h17m05s
Folder 09 Jun 2015 run_0002.dat.07-05-15_14h22m27s
Folder 09 Jun 2015 run_0003.dat.11-05-15_09h55m42s
Folder 09 Jun 2015 run_0004.dat.11-05-15_10h06m11s
Folder 09 Jun 2015 run_0005.dat.11-05-15_11h05m02s
Folder 09 Jun 2015 run_0006.dat.11-05-15_11h58m20s
Folder 09 Jun 2015 run_0007.dat.11-05-15_12h16m13s
Folder 09 Jun 2015 run_0008.dat.11-05-15_12h26m07s
Folder 09 Jun 2015 run_0009.dat.11-05-15_12h34m52s
Folder 09 Jun 2015 run_0010.dat.11-05-15_13h04m12s
Folder 09 Jun 2015 run_0011.dat.11-05-15_13h55m12s
...
Define the input folder:#
Define the output folder:#
In principle, this step is not required because the output folder is already mounted in the docker image and set by default to /data.
Define the include and exclude patterns:#
Here, .*.adf will exclude adf files, and .*/03./.* will include the 03 (A, B, C) crystals.
Build the list of files to be downloaded (catalog generation):#
root@:/opt/AgataGrid$ GridDataSync.py --build_list
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
INPUTDIR : e680/e680/run_0003.dat.11-05-15_09h55m42s
OUTPUTDIR : /data/
Include pattern : .*/03./.*
Exclude pattern : .*.adf
==> All the files have been released from disks
=> adding: 80.0 kB e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03A/SRM_AGATA_small_files.tar
=> adding: 80.0 kB e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03B/SRM_AGATA_small_files.tar
=> adding: 80.0 kB e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03C/SRM_AGATA_small_files.tar
=> adding: 1020.0 MB e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03A/SRM_AGATA_event_mezzdata.cdat.0000
=> adding: 64.5 MB e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03A/SRM_AGATA_small_files.tar
=> adding: 895.0 MB e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03B/SRM_AGATA_event_mezzdata.cdat.0000
=> adding: 68.8 MB e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03B/SRM_AGATA_small_files.tar
=> adding: 890.0 MB e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03C/SRM_AGATA_event_mezzdata.cdat.0000
=> adding: 40.4 MB e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03C/SRM_AGATA_small_files.tar
-- 9 Files from e680/e680/run_0003.dat.11-05-15_09h55m42s added to the list of files to be downloaded
--> Total files size: 2.9 GB
Copy the files from tapes to disks:#
The --bring_online operation can take time. Once it is launched, you can exit with CTRL+C; the staging of the files will continue anyway.
root@:/opt/AgataGrid$ GridDataSync.py --bring_online
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
INPUTDIR : e680/e680/run_0003.dat.11-05-15_09h55m42s
OUTPUTDIR : /data/
Include pattern : .*/03./.*
Exclude pattern : .*.adf*
... Start the copy of files from tapes to disks ...
-> press CTRL+C to skip the display (the staging operation will keep working in background)
Number of files to be bring online: 9
** Staging launched for 9/9 files **
[\] ... 4 files staged over 9, 5 remaining ... stopping the dispay, the stagging continues in background...
The status of the --bring_online command can then be checked with the --check_status command.
Check the locality of the files:#
Once the staging has been launched, you need to wait a few minutes and test with the --check_status command until the first files are in the ONLINE_AND_NEARLINE status:
root@:/opt/AgataGrid$ GridDataSync.py --check_status --verbose
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
INPUTDIR : e680/e680/run_0003.dat.11-05-15_09h55m42s
OUTPUTDIR : /data/
Include pattern : .*/03./.*
Exclude pattern : .*.adf
... updating the catalog
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03A/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03B/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03C/SRM_AGATA_small_files.tar
NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03A/SRM_AGATA_event_mezzdata.cdat.0000
NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03A/SRM_AGATA_small_files.tar
NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03B/SRM_AGATA_event_mezzdata.cdat.0000
NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03B/SRM_AGATA_small_files.tar
NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03C/SRM_AGATA_event_mezzdata.cdat.0000
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03C/SRM_AGATA_small_files.tar
-- 9 Files from e680/e680/run_0003.dat.11-05-15_09h55m42s in the list
==> 0 downloaded (9 remaining)
==> 4 brought online for non downloaded files (5 remaining)
Here, 5 files are still to be copied to disk. If we repeat this operation a few minutes later, we obtain:
root@:/opt/AgataGrid$ GridDataSync.py --check_status --verbose
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
INPUTDIR : e680/e680/run_0003.dat.11-05-15_09h55m42s
OUTPUTDIR : /data/
Include pattern : .*/03./.*
Exclude pattern : .*.adf
... updating the catalog
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03A/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03B/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Conf/03C/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03A/SRM_AGATA_event_mezzdata.cdat.0000
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03A/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03B/SRM_AGATA_event_mezzdata.cdat.0000
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03B/SRM_AGATA_small_files.tar
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03C/SRM_AGATA_event_mezzdata.cdat.0000
ONLINE_AND_NEARLINE : e680/e680/run_0003.dat.11-05-15_09h55m42s/Data/03C/SRM_AGATA_small_files.tar
-- 9 Files from e680/e680/run_0003.dat.11-05-15_09h55m42s in the list
==> 0 downloaded (9 remaining)
==> 9 brought online for non downloaded files (0 remaining)
All the files are now on disk and ready to be downloaded. Note that it is not mandatory to wait until all the files are staged before starting the download. The catalog will be updated, and restarting the download later will only download the missing files.
Download the files:#
Once it is ready, start the download with the --start option:
root@:/opt/AgataGrid$ GridDataSync.py --start
********************************
** GridDataSync configuration **
********************************
SERVER : srm://ccsrm02.in2p3.fr:8443/srm/managerv2?SFN=
BASE_DIR_ON_GRID : /pnfs/in2p3.fr/data/agata/
INPUTDIR : e680/e680/run_0003.dat.11-05-15_09h55m42s
OUTPUTDIR : /data/
Include pattern : .*/03./.*
Exclude pattern : .*.adf
... updating the catalog
-- 9 Files from e680/e680/run_0003.dat.11-05-15_09h55m42s in the list
==> 0 downloaded (9 remaining)
==> 9 brought online for non downloaded files (0 remaining)
...starting to download the 9 requested files...
Copied files: 9/9, current: 40.4 MB, total: 2.9 GB/ 2.9 GB, rate= 27.6 MB/s
-- 9 Files have been to be downloaded in 1min, 41s,
... updating the catalog
-- 9 Files from e680/e680/run_0003.dat.11-05-15_09h55m42s in the list
==> All the files have been downloaded