Automatically Update a Google Sheet From a Hosted CSV

Automatically Update a Google Sheet From a Hosted CSV

June 2, 2021 - Automation

Update September 2022: Google increased the cell limit from 5 to 10 million.

In a previous tutorial, I showed how to bring in JSON data into a Tableau Web Data Connector. But what if you don't want to use a WDC? This could be because you use Tableau Online and don't want to deal with Bridge to refresh the data. Or perhaps your Tableau Server is locked down and you aren't allowed to safelist a WDC. Or perhaps you aren't using Tableau at all and just want an alternative way to consume data that can be found on the web. Well, one alternative is to bring that data into a Google Sheet. This is freely available for anyone to use and is a native built-in connector for Tableau Server and Tableau Online. There are plenty of pre-built apps and tools to do this but if you want to do this yourself I think one of the simplest ways is to create a standalone Google App Script. It's already a part of the Google ecosystem so it's easy to authenticate and it's fully free to use!

Prerequisites

  • Google Account - You'll need a free Google account in order to create the scripts and Google Sheet.
  • A Google Sheet - You will need to have an empty new Google Sheet created where you want to place the CSV data.
  • A Hosted CSV - Finally you'll need a CSV of data that you want to bring into Google Sheets. In this example, I'll be using US Covid data from the NY Times. (GitHub, Raw File)

Set up the Google Sheet

The first thing we need to do is create a new Sheet where we will send the data. You can use an existing sheet but I suggest at least making a new tab. To do this go to Google Sheets and click the + button to create a new blank sheet. Name your spreadsheet something and also name your destination tab something specific, you'll need this to reference the tab later on.

Create a new App Script

Now that we have a destination for the data set up we can focus on the script itself. Open Google App Scripts and click + New Project. Rename the project to something you'll remember so you can find it again later. First, we're going to set up the variables that we'll use throughout the script. Before the myFunction function add the following (Use your own input values!):

const csvUrl = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv';
const delimiter = ',';
const googleSheetId = '1qu26GbyU1230xK4MU7tsFFHffs45nIwyXxabcQHCzo608km';
const targetSheet = 'US Covid Data';
  1. csvURL is the URL for the CSV.
  2. delimiter is the character used to separate values in your file. The default is a comma but you can change this to another character, for example ';' for a semi-colon-separated file.
  3. googleSheetId is the ID for the entire spreadsheet where you want to send the data. You can find this by looking at the URL of the spreadsheet. For example, if the URL looked like https://docs.google.com/spreadsheets/d/1qu26GbyU1230xK4MU7tsFFHffs45nIwyXxabcQHCzo608km/edit#gid=0 then the ID would be 1qu26GbyU1230xK4MU7tsFFHffs45nIwyXxabcQHCzo608km.
  4. targetSheet is the name of the tab you want to update with the CSV data and should match what you named the destination tab in your Google Sheet.

Fetch and parse the CSV

Once we have our variables set we want to fetch the CSV. To do this we can use the UrlFetchApp service to get the data. Within the myFunction function add the following:

let response = UrlFetchApp.fetch(csvUrl);
let data = response.getContentText();

Here we are fetching the CSV and retrieving the text data.

Next, we want to parse the CSV so it can be entered correctly into the Google Sheet. To do this first we'll split each row into a separate array. Then we can loop through each row and split the data using RegEx. I suggest using a regular expression here instead of simply splitting on your delimiter since there can be times when the data includes the delimiter. For example "New York, New York". Finally, we'll strip out the quotes that are framing the strings. Once we have the array of arrays we can do some checks to make sure we actually received some data, otherwise exit the script and throw an error.

let pattern = new RegExp(delimiter + '(?=(?:(?:[^"]*"){2})*[^"]*$)');
let rows = data.split(/\r?\n/);
rows = rows.map((row) => row.split(pattern).map((cell) => cell.replace(/^"(.*)"$/, '$1')));
if (!rows || !rows.length || rows.length < 1 || !Array.isArray(rows)) {
  throw new Error('Something went wrong!');
}

Let's test what we have so far. Add console.log(rows) at the end of myFunction before the closing }. Click the floppy disk icon to save the script and then click Run. The first time you run the script you will need to authorize the script to run. Select Review permissions and then sign in with the same Google Account you are using for App Scripts. Since this is just your own personal script you may get an error that Google hasn't verified the app. Select the Advanced link in the bottom left and click "Go to (unsafe)". Notice if you read the advice it says something like "Continue only if you understand the risks and trust the developer ()." but hey that's you! You trust yourself right?! Now you should be able to Allow the script access to "Connect to an external service". Once you've completed the authorization you should see a preview of the rows in the console. Delete the console.log and move on to the next step.

Add the Sheets service

Before we can make updates to any Sheets we'll need to include the service in our script. In the left navigation pane select the + next to Services. Scroll through the list that pops up and select Google Sheets API. Keep the Identifier as Sheets and this tutorial was written for version 4. Click Add to add it to your script.

Update the Google Sheet

Now that we have the Sheets API enabled and the data is in a format that Google Sheets can accept we can actually update the Sheet. The first part of this process is to tell Google Sheets what area of the worksheet you plan on writing to. This requires knowing the column letters for the end of the data (Example: AF). Let's add a function outside of myFunction that can help us determine the column letters based on how many columns our data has:

function columnToLetter(columnNum) {
  let temp;
  let letter = '';
  while (columnNum > 0) {
    temp = (columnNum - 1) % 26;
    letter = String.fromCharCode(temp + 65) + letter;
    columnNum = (columnNum - temp - 1) / 26;
  }
  return letter;
}

The Sheet columns use a repeating alphabet for the columns. So for example, if there are 30 columns the ending column is AD which is all the way through the first 26 characters (A) plus 4 more (D). So in this function, we are using modulo to determine the leftover value after dividing by 26 (for 26 letters). Then we can convert that leftover amount using fromCharCode with an offset of 65 which starts us in the capital letter range. We then subtract that leftover amount to see if there is still more to look for.

Now that we have a nice function to use we can determine the range where we want to write to. Back inside the myFunction function before the closing } add the following:

let endingColumn = columnToLetter(rows[0].length);
let range = `${targetSheet}!A1:${endingColumn}${rows.length}`;
let valueRange = Sheets.newValueRange();
valueRange.values = rows;
Sheets.Spreadsheets.Values.clear({}, googleSheetId, `${targetSheet}!A:ZZ`);
Sheets.Spreadsheets.Values.update(valueRange, googleSheetId, range, {
  valueInputOption: 'USER_ENTERED',
});
  1. We get the number of columns by looking at how many items are in the first row and use our function to turn that into column letters.
  2. We then use that to define the full range starting in A1 using the range syntax. (Example: A1:F300).
  3. Next we create a new Value Range and set its values to the rows of data we retrieved from the CSV.
  4. Then we clear out any data that already exists. This command requires a specific range so we can use A:ZZ to clear out all rows of the first 676 columns. If you have more columns than that you may want to reconsider your process 😏.
  5. Finally we update the spreadsheet with the data.

Test the script

Let's save and run the script again. Now that we are trying to access a Google Sheet you will need to authorize this permission as well. Follow the same process as before with the Advanced option. Notice this time around the app is also requesting permission to modify your Google Sheets. Once you've authorized the script and it completes its execution you should see the output show up in your Google Sheet! If there are any errors you should see them in the execution console below so you can troubleshoot.

Set up the schedule

Once you are able to successfully run your script and see the results in Google Sheet it's time to set up a schedule for the script to run automatically. In the far left menu pane click on the alarm clock icon to open up the Triggers page. In the bottom left corner select + Add Trigger. Make sure the function to run is set to myFunction. Then select your preferred type of time-based trigger and the interval. For example, to refresh this CSV every 15 minutes you would set the type to be Minutes timer and the interval to be Every 15 minutes. Once you've configured the trigger how you like click Save.

That's it you're all set. Your Google Sheet will now be updated automatically at the interval you set. Let me know if you were able to successfully get your CSV data into Google Sheets in the comments below!

Things to note

A couple of things to keep in mind when using this script:

  1. Google Sheets has a cell limit of 10 million. This may seem like a lot but it can easily be surpassed by large CSVs. Note that this does not mean 10 million cells with data, it's just any 10 million cells even if empty. So if you are running into an error saying you have too many cells but your data doesn't actually have that many cells try deleting the extra columns in your Sheet and trying it again. Because even if your data is only in columns A through F, if you have columns G through Z visible it is also counting those cells.
  2. This script was pretty simple but it has plenty of room for more customization, such as filtering the data, manipulating or cleaning the data or adding parameters to the request URL. Hopefully, from here you are able to expand this example to fit your needs.
  3. There is no need to deploy this script, it will run just based on the trigger you set up.

Complete Script

// Set these variables to match your information
const csvUrl = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv'; // The URL where your CSV file lives. Not Sheets has 5 mil cell limit.
const delimiter = ','; // Default is a comma, can change if necessary.
const googleSheetId = '1qu26GbyU1230xK4MU7tsFFHffs45nIwyXxabcQHCzo608km'; // For https://docs.google.com/spreadsheets/d/1qu26GbyU1230xK4MU7tsFFHffs45nIwyXxabcQHCzo608km/edit#gid=0 ID would be "1qu26GbyU1230xK4MU7tsFFHffs45nIwyXxabcQHCzo608km".
const targetSheet = 'US Covid Data'; // Name of the tab you want to import the CSV into.

function myFunction() {
  let response = UrlFetchApp.fetch(csvUrl);
  let data = response.getContentText();
  let rows = data.split(/\r?\n/);
  let pattern = new RegExp(delimiter + '(?=(?:(?:[^"]*"){2})*[^"]*$)');
  rows = rows.map((row) => row.split(pattern).map((cell) => cell.replace(/^"(.*)"$/, '$1')));

  if (!rows || !rows.length || rows.length < 1 || !Array.isArray(rows)) {
    throw new Error('Something went wrong!');
  }

  let endingColumn = columnToLetter(rows[0].length);
  let range = `${targetSheet}!A1:${endingColumn}${rows.length}`;
  let valueRange = Sheets.newValueRange();
  valueRange.values = rows;
  Sheets.Spreadsheets.Values.clear({}, googleSheetId, `${targetSheet}!A:ZZ`);
  Sheets.Spreadsheets.Values.update(valueRange, googleSheetId, range, {
    valueInputOption: 'USER_ENTERED',
  });
}

function columnToLetter(columnNum) {
  let temp,
    letter = '';
  while (columnNum > 0) {
    temp = (columnNum - 1) % 26;
    letter = String.fromCharCode(temp + 65) + letter;
    columnNum = (columnNum - temp - 1) / 26;
  }
  return letter;
}

Learned something new? Subscribe and never miss a new post!

Subscribe

Sign up to never miss a post!

Leave a comment

Your email address is optional and will not be published.

A picture of Keshia Rose.

About Me

Hi, I'm Keshia, an ex-Tableau Product Manager currently evangelizing Fingerprint Pro to developers who has a love for building fun random apps and helping people solve problems. Keep coming back to my blog to learn how to use Tableau APIs and developer tools or to hear about other cool tech I'm learning about!
Subscribe

Sign up to never miss a post!

Topic Request

Want me to cover a specific topic? Let me know!