Auton Lab ML datasets repo

Predrag Punosevac predragp at andrew.cmu.edu
Fri Mar 27 00:10:14 EDT 2020


Dear Autonians,

The Auton Lab has a new NFS share called

/zfsauton/datasets

where people can find standard datasets commonly used by our lab
members. For now there are only two of them, but I will be happy to add
more as you make them available or point me to the sources.
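
For anyone who wants to check the share from a script before downloading
a dataset again, here is a minimal Python sketch. It assumes the share is
mounted at /zfsauton/datasets on the node and that each dataset lives in
its own top-level directory. The helper names and the "cityscapes"
directory name are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import List, Optional

# Assumption: the NFS share is mounted at this path on every computing node.
DATASETS_ROOT = Path("/zfsauton/datasets")


def list_datasets(root: Path = DATASETS_ROOT) -> List[str]:
    """Return the sorted names of the top-level directories under the share."""
    if not root.is_dir():
        # Share not mounted on this node (or wrong path): report nothing.
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def find_dataset(name: str, root: Path = DATASETS_ROOT) -> Optional[Path]:
    """Return the path of a dataset directory matching `name`, ignoring case."""
    if not root.is_dir():
        return None
    for p in root.iterdir():
        if p.is_dir() and p.name.lower() == name.lower():
            return p
    return None


if __name__ == "__main__":
    print("Available datasets:", list_datasets())
    # "cityscapes" is a hypothetical directory name used for illustration.
    hit = find_dataset("cityscapes")
    print(hit if hit else "Not on the share yet; consider asking to have it added.")
```

If the lookup returns nothing, the dataset is probably not on the share
yet, so it is worth asking before making another private copy.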

Best,
Predrag

On Wed, Mar 25, 2020 at 7:09 PM Hitesh Arora <hiteshar at andrew.cmu.edu> wrote:
>
> Hi Predrag,
>
> That's a great idea.
>
> Here are a couple of datasets/binaries we could share. I would ask others to share their datasets as well.
>
> 1. CARLA binaries: /zfsauton2/home/hiteshar/CARLA_0.9.6
> (These are currently used by at least 6-8 people, each with their own copy)
>
> 2. Cityscapes: /zfsauton/project/public/deep_clustering/data/datasets/cityscapes
> (Thanks to Vincent for sharing)
>
> Thanks,
> Hitesh
>
> On Wed, Mar 25, 2020 at 6:46 PM Predrag Punosevac <predragp at andrew.cmu.edu> wrote:
>>
>> Hitesh Arora <hiteshar at andrew.cmu.edu> wrote:
>>
>> > Hi Autonians,
>> >
>> > Has anyone already downloaded the GTA5 and/or Cityscapes datasets onto the
>> > Auton cluster, and would you be willing to share them? It would save a bit
>> > of time in downloading/transfer. Thanks!
>>
>> Actually, I have a better idea. Before we all went into hiding I fully
>> provisioned the newest file server, bought by Dr. Jeff Schneider late
>> last year. For now the server has only one ZFS pool due to the lack of
>> HDDs, but the six storage HDDs that are installed are 12 TB each. Thus the
>> ZFS pool I created has 48 TB of usable space, even before the standard lz4
>> compression ZFS uses. For the first time we even have a dedicated SLOG
>> for the pool, a ZFS mirror of 2 NVMe drives.
>>
>> https://www.ixsystems.com/blog/zfs-zil-and-slog-demystified/
>>
>> although that should only impact write speed.
>>
>> I have already remotely replicated the /zfsauton/data and /zfsauton/project
>> datasets onto the new file server, but I am scared to pull the trigger and
>> change anything in the current NFS setup before we come out of hiding.
>> However, I think it would be a very good idea to consolidate all
>> standard third-party datasets people use in the Lab in a new ZFS
>> dataset/NFS share which will be accessible to all users across all
>> computing nodes. That, in theory, should offload some of the traffic
>> from the other 2 NFS servers we use for home, data, and project directories.
>> It should also partially work around the fact that scratch
>> directories are machine-specific. Hopefully it would also help us use
>> less storage space, as we would solve the duplication problem.
>>
>> What I need from users is the name of each dataset you think we should
>> share with everyone and its current location.
>>
>> Most Kind Regards,
>> Predrag
>>
>>
>>
>> >
>> > Regards,
>> > Hitesh

