In big data projects, a common pattern is to pull data from Google Drive into an orchestration workflow, store it in a data lake, and then run operations such as data validation, cleansing, and transformation to extract business insights. In this Python sample code, we upload files to Google Drive using Python so that they can be used in data flow orchestration processes.
Steps for Uploading files on Google Drive using Python
Table of Contents
- Steps for Uploading files on Google Drive using Python
- Pre-Requisites
- Step 1: Import the libraries
- Step 2: OAuth made easy
- Step 3: Upload files to your Google Drive
- Step 4: List out files from Google Drive
- Step 5: Download the files from Google Drive
- Step 6: Create the Text files in Google Drive
- Step 7: Read the content of the text file directly from Google Drive
Pre-Requisites
- Install the pydrive Python module as follows: pip install pydrive
- The code below can be run in a Jupyter notebook or any Python console
Step 1: Import the libraries
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
Step 2: OAuth made easy
Follow the steps in the linked guide, Get Authentication for Google Service API, to obtain credentials for the Google Drive API.
Download client_secrets.json from the Google API Console and place it in your working directory. OAuth 2.0 is then done in two lines. You can customize the behavior of OAuth 2.0 in a single settings file, settings.yaml.
gauth = GoogleAuth()
drive = GoogleDrive(gauth)
Together, the steps above look as follows:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
drive = GoogleDrive(gauth)
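With only the two lines above, pydrive typically opens the browser consent flow whenever no valid credentials are available. The following is a minimal sketch of caching credentials between runs, not part of the original walkthrough. The helper name `authenticate` and the file name `mycreds.txt` are assumptions chosen for illustration; the methods it calls (`LoadCredentialsFile`, `LocalWebserverAuth`, `Refresh`, `Authorize`, `SaveCredentialsFile`) are methods of pydrive's `GoogleAuth`.

```python
def authenticate(gauth, credentials_path="mycreds.txt"):
    """Authorize a GoogleAuth instance, reusing cached credentials if present.

    `credentials_path` is a hypothetical file name used for this sketch.
    """
    # Try to load credentials saved by a previous run.
    gauth.LoadCredentialsFile(credentials_path)
    if gauth.credentials is None:
        # Nothing cached yet: run the browser-based consent flow.
        gauth.LocalWebserverAuth()
    elif gauth.access_token_expired:
        # Cached credentials exist but the access token has expired.
        gauth.Refresh()
    else:
        # Cached credentials are still valid.
        gauth.Authorize()
    # Persist the (possibly refreshed) credentials for the next run.
    gauth.SaveCredentialsFile(credentials_path)
    return gauth


# Usage (requires pydrive and client_secrets.json in the working directory):
#   from pydrive.auth import GoogleAuth
#   from pydrive.drive import GoogleDrive
#   drive = GoogleDrive(authenticate(GoogleAuth()))
```

Passing the `GoogleAuth` instance in as an argument keeps the helper independent of how the instance was configured, for example through settings.yaml.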
Step 3: Upload files to your Google Drive
upload_file_list = ['1.jpg', '2.jpg']
for upload_file in upload_file_list:
    gfile = drive.CreateFile({'parents': [{'id': '1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t'}]})
    # Read file and set it as the content of this instance.
    gfile.SetContentFile(upload_file)
    gfile.Upload()  # Upload the file.
What the above code does:
- It uploads the two local files 1.jpg and 2.jpg to the Google Drive folder test/. For each local file, pydrive creates a file object in the target folder, reads the local file as its content, and uploads it.
- Note that we need to provide the ID of the target Google Drive folder. In this example, the test folder's ID is 1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t. You can get the Google Drive folder ID from the browser.
- For example, when a folder is open in Google Drive and the browser shows the address //drive.google.com/drive/folders/1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi, the folder ID is the part after the last / symbol, which is 1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi.
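The folder ID lookup and the upload loop described above can be sketched as two small helpers. This is an illustrative sketch rather than the original code: the names `folder_id_from_url` and `upload_files` are hypothetical, `drive` is assumed to be an authenticated `GoogleDrive` instance, and the helper skips local files that do not exist instead of raising.

```python
import os
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the last path segment of a Drive folder URL (the folder ID)."""
    # urlparse separates any query string (such as ?usp=sharing) from the path.
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def upload_files(drive, folder_id, paths):
    """Upload each existing local file in `paths` to the given Drive folder."""
    uploaded = []
    for path in paths:
        if not os.path.isfile(path):
            print("Skipping missing file: {}".format(path))
            continue
        # 'parents' places the new file inside the target folder.
        gfile = drive.CreateFile({
            "parents": [{"id": folder_id}],
            "title": os.path.basename(path),
        })
        gfile.SetContentFile(path)  # Read the local file as the content.
        gfile.Upload()
        uploaded.append(gfile)
    return uploaded


# Usage, assuming `drive` was created as in Step 2:
#   folder_id = folder_id_from_url(
#       "//drive.google.com/drive/folders/1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi")
#   upload_files(drive, folder_id, ["1.jpg", "2.jpg"])
```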
Step 4: List out files from Google Drive
We can also list all files in a specific Google Drive folder as follows:
file_list = drive.ListFile({'q': "'{}' in parents and trashed=false".format('1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi')}).GetList()
for file in file_list:
    print('title: %s, id: %s' % (file['title'], file['id']))
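The listing step can be wrapped in a small helper that builds the query string and returns plain (title, id) pairs. This is a sketch under assumptions: the names `folder_query` and `list_folder` and the `include_folders` flag are hypothetical, and `drive` is assumed to be an authenticated `GoogleDrive` instance. The MIME type `application/vnd.google-apps.folder` is the one Google Drive uses for folders.

```python
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def folder_query(folder_id, include_folders=True):
    """Build a Drive search query for non-trashed items in a folder."""
    query = "'{}' in parents and trashed=false".format(folder_id)
    if not include_folders:
        # Exclude subfolders so only regular files are returned.
        query += " and mimeType != '{}'".format(FOLDER_MIME_TYPE)
    return query


def list_folder(drive, folder_id, include_folders=True):
    """Return a list of (title, id) pairs for the items in a Drive folder."""
    params = {"q": folder_query(folder_id, include_folders)}
    file_list = drive.ListFile(params).GetList()
    return [(f["title"], f["id"]) for f in file_list]


# Usage, assuming `drive` was created as in Step 2:
#   for title, file_id in list_folder(drive, "1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi"):
#       print("title: %s, id: %s" % (title, file_id))
```

Returning pairs rather than a dictionary keeps every entry, since Google Drive allows several files with the same title in one folder.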
Step 5: Download the files from Google Drive
We can also download the files from Google Drive as follows. Note that the files must be listed first (Step 4), because the download loop iterates over file_list.
for i, file in enumerate(sorted(file_list, key=lambda x: x['title']), start=1):
    print('Downloading {} file from GDrive ({}/{})'.format(file['title'], i, len(file_list)))
    file.GetContentFile(file['title'])
The above code downloads every file from the specified folder. Note that the files are saved in the directory where the code is executed.
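Because the downloads land in the current working directory, it can be convenient to send them to a dedicated folder instead. The sketch below assumes `file_list` holds the objects returned by the listing step; the helper name `download_all` and the default directory name `downloads` are hypothetical.

```python
import os


def download_all(file_list, dest_dir="downloads"):
    """Download each listed Drive file into `dest_dir`; return the local paths."""
    os.makedirs(dest_dir, exist_ok=True)
    ordered = sorted(file_list, key=lambda f: f["title"])
    saved = []
    for i, gfile in enumerate(ordered, start=1):
        target = os.path.join(dest_dir, gfile["title"])
        print("Downloading {} from GDrive ({}/{})".format(
            gfile["title"], i, len(ordered)))
        # GetContentFile writes the remote content to the given local path.
        gfile.GetContentFile(target)
        saved.append(target)
    return saved


# Usage, assuming `file_list` comes from Step 4:
#   paths = download_all(file_list, dest_dir="gdrive_files")
```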
Step 6: Create the Text files in Google Drive
We can also write files directly to Google Drive using the following code:
# Create a GoogleDriveFile instance with title 'test.txt'.
file1 = drive.CreateFile({'parents': [{'id': '1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi'}], 'title': 'test.txt'})
# Set content of the file from the given string.
file1.SetContentString('Hello World!')
file1.Upload()
Output of the above code: the file test.txt is created in the Google Drive folder.
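Google Drive allows several files with the same title in one folder, so running the creation code twice produces two test.txt files. A minimal sketch of one way to avoid that is to look for an existing file first and overwrite its content. The helper name `upsert_text_file` is hypothetical, `drive` is assumed to be an authenticated `GoogleDrive` instance, and if several matches already exist the sketch simply updates the first one returned.

```python
def upsert_text_file(drive, folder_id, title, text):
    """Create `title` in the folder, or overwrite it if it already exists."""
    # Escape backslashes and single quotes so the title is safe in the query.
    safe_title = title.replace("\\", "\\\\").replace("'", "\\'")
    query = "title = '{}' and '{}' in parents and trashed=false".format(
        safe_title, folder_id)
    matches = drive.ListFile({"q": query}).GetList()
    if matches:
        gfile = matches[0]  # Reuse the existing file.
    else:
        gfile = drive.CreateFile(
            {"parents": [{"id": folder_id}], "title": title})
    gfile.SetContentString(text)
    gfile.Upload()
    return gfile


# Usage, assuming `drive` was created as in Step 2:
#   upsert_text_file(drive, "1cIMiqUDUNldxO6Nl-KVuS9SV-cWi9WLi",
#                    "test.txt", "Hello World!")
```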
Step 7: Read the content of the text file directly from Google Drive
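We can also read the file's content directly from Google Drive using the code below:
# Create a GoogleDriveFile instance that points at the uploaded file by its ID.
file2 = drive.CreateFile({'id': file1['id']})
# GetContentString returns the content as a string; its first argument is a MIME type, not a file name.
print(file2.GetContentString())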
The output of the above code is the content of the file: Hello World!
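In a data workflow, the string read from Drive is usually parsed before further processing. The sketch below assumes the Drive file holds comma-separated text with a header row, which is an assumption for illustration and not something the example file test.txt contains. The helper name `read_csv_rows` is hypothetical, and `drive` is assumed to be an authenticated `GoogleDrive` instance.

```python
import csv
import io


def read_csv_rows(drive, file_id):
    """Read a CSV file from Drive by ID and return its rows as dictionaries."""
    gfile = drive.CreateFile({"id": file_id})
    text = gfile.GetContentString()
    # Parse the in-memory string; nothing is written to local disk.
    return list(csv.DictReader(io.StringIO(text)))


# Usage, assuming `drive` was created as in Step 2 and `csv_file_id`
# is the ID of a CSV file stored in Drive:
#   rows = read_csv_rows(drive, csv_file_id)
```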