Joey Tuason: Data Science Portfolio


Data science and programming enthusiast with a passion for exploring Machine Learning models

Currently working as an actuary

An advocate for the WFH movement -- I believe this is the future

Never (and never will be) a boomer


Load Fast in Google Colab, But Careful…

We’ve all been there: Jupyter cannot install a certain package and your project deadline is three hours away. What do you do?

Well, Google Colab (short for Colaboratory) is basically Jupyter, but online. There is just one major issue… how do we load data?

In Jupyter, we can simply put the file in the same folder as our notebook and load it as usual. If the file is in a subfolder, simply change the path in your reader code. But what happens if your notebook lives in Google Colab?

There are three approaches. I’ll present the first one because it is the best, but be careful with it:

Use gdown

With gdown, you can keep your dataset in Google Drive and have your code download the file from Drive into Colab without the annoying pop-up. Follow the steps below.

This approach relies on a shareable link to the file, and that is the drawback: anyone who gets access to that link can download the dataset. That is really not wise if you are dealing with private data. Although link-shared folders in Google Drive are not searchable, your notebook can be if it is public, and the link sits right there in the code. So if you decide to go this route with sensitive data, make sure your notebook is Restricted.

b. Get the ‘Key’

image
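Assuming the ‘Key’ is the file ID embedded in the Drive share link (which is what gdown normally works with), a small helper can pull it out so you do not have to copy it by hand. This is only a sketch: the function name and the sample link are made up for illustration.

```python
import re

# Two common shapes of a Google Drive link:
#   https://drive.google.com/file/d/<ID>/view?usp=sharing
#   https://drive.google.com/open?id=<ID>
_ID_PATTERNS = (
    re.compile(r"/d/([A-Za-z0-9_-]+)"),
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),
)


def extract_drive_key(share_link: str) -> str:
    """Return the file ID found in a Google Drive share link."""
    for pattern in _ID_PATTERNS:
        match = pattern.search(share_link)
        if match:
            return match.group(1)
    raise ValueError(f"No Drive file ID found in: {share_link}")


# Placeholder link, not a real file
print(extract_drive_key("https://drive.google.com/file/d/FAKE_KEY_123/view?usp=sharing"))
```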

c. Insert this code in your Notebook and Run
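The cell would look roughly like the sketch below. It assumes the key is the Drive file ID and that the file is shared by link; `FILE_ID`, `data.csv`, and the helper names are placeholders, not the original values. gdown is usually available on Colab already; if the import fails, run `pip install gdown` first.

```python
def drive_download_url(file_id: str) -> str:
    """Build the direct-download URL that gdown accepts for a file ID."""
    return f"https://drive.google.com/uc?id={file_id}"


def fetch_from_drive(file_id: str, output: str) -> str:
    """Download a link-shared Drive file into the Colab session.

    Returns the local path that gdown wrote the file to.
    """
    import gdown  # imported here so the helper above works without gdown

    return gdown.download(url=drive_download_url(file_id), output=output, quiet=False)


# Hypothetical usage in a Colab cell (replace FILE_ID with your own key):
# path = fetch_from_drive("FILE_ID", "data.csv")
# df = pd.read_csv(path)
```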

Observe that the load time is super fast!

image

Voila! You got this.

Other Methods

1 Upload the file and run normally

This would be good, but if you close your notebook or reach Google Colab’s 12-hour limit, you will need to upload the file again. Also, uploading takes a long time for large files.

image
As shown in this image, just open the left pane and drag your files into it. Then read the file from the notebook.
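Because uploaded files disappear when the session resets, it can help to check that the file is still there before reading it. A minimal sketch, assuming the file was dropped into Colab’s default working directory `/content`; the function name and `data.csv` are hypothetical.

```python
from pathlib import Path


def resolve_upload(name: str, root: str = "/content") -> Path:
    """Return the path of an uploaded file, or fail with a clear message."""
    path = Path(root) / name
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; the session may have reset, so upload it again."
        )
    return path
```

In a notebook this would be used as `pd.read_csv(resolve_upload("data.csv"))`.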

2 Put in a Google drive folder

This is the safe way, although it is pesky that the Google prompt pops up every time you rerun your notebook. But hey, it’s safe!
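A minimal sketch of this approach uses Colab’s `drive.mount`. The mount point `/content/drive` is the conventional one, and `datasets/data.csv` is a made-up path inside your Drive; adjust both to your own setup.

```python
from pathlib import PurePosixPath

MOUNT_POINT = "/content/drive"


def mount_drive() -> None:
    """Mount Google Drive in the Colab session (triggers the Google prompt)."""
    from google.colab import drive  # only importable inside Colab

    drive.mount(MOUNT_POINT)


def drive_path(relative: str) -> PurePosixPath:
    """Map a path inside 'My Drive' to its location under the mount point."""
    return PurePosixPath(MOUNT_POINT) / "MyDrive" / relative


# Hypothetical usage in a Colab cell:
# mount_drive()
# df = pd.read_csv(str(drive_path("datasets/data.csv")))
```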

image


These prompts will appear every time you rerun the code… brace yourselves: image image