Currently we use RImageFolder (a subclass of PyTorch's ImageFolder) for both dataset operations (imagelist, visualization, annotation, etc.) and training. In other words, when we view or annotate images, we read from RImageFolder objects, and when we train, we also feed an RImageFolder object directly into the model. We might consider migrating away from this approach and handling the two concerns separately. Specifically, we would query the database to fetch images, and only create RImageFolder objects on the fly when we run images through models in batches (e.g., when training or testing).
This is a necessary step if we want to scale up the system. But before that, we need to write a design doc that analyzes the pros and cons and shows how the change fits into our future roadmap.
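To make the proposed split concrete, here is a minimal sketch of the idea, not a proposed implementation. It assumes a hypothetical `images` table with `path` and `label` columns in SQLite; the function names `make_demo_db`, `fetch_samples`, and `iter_batches` are illustrative and do not exist in the codebase. Dataset operations read from the database, and only the batching step marks where an RImageFolder object would be built on the fly.

```python
import sqlite3
from typing import Iterator, List, Optional, Tuple

# (image path, class index); assumed to mirror the (path, class_index)
# tuples that ImageFolder keeps in its `samples` attribute.
Sample = Tuple[str, int]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical `images` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT, label INTEGER)"
    )
    conn.executemany(
        "INSERT INTO images (path, label) VALUES (?, ?)",
        [("img/a.png", 0), ("img/b.png", 1), ("img/c.png", 0),
         ("img/d.png", 1), ("img/e.png", 0)],
    )
    return conn


def fetch_samples(conn: sqlite3.Connection,
                  label: Optional[int] = None) -> List[Sample]:
    """Dataset operations (listing, viewing, annotating) query the database."""
    if label is None:
        rows = conn.execute("SELECT path, label FROM images ORDER BY id")
    else:
        rows = conn.execute(
            "SELECT path, label FROM images WHERE label = ? ORDER BY id",
            (label,),
        )
    return [(path, lab) for path, lab in rows]


def iter_batches(samples: List[Sample],
                 batch_size: int) -> Iterator[List[Sample]]:
    """Yield fixed-size slices of samples.

    Each slice is the point where an RImageFolder-like object would be
    created on the fly and handed to the model for training or testing.
    """
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


if __name__ == "__main__":
    conn = make_demo_db()
    for batch in iter_batches(fetch_samples(conn), batch_size=2):
        print(batch)
```

In this sketch the viewing and annotation code never touches a dataset object, so the database schema can evolve independently of the training pipeline. Whether that trade-off is worth the extra indirection is one of the questions for the design doc.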