How to Build a Tableau Web Data Connector with OAuth

How to Build a Tableau Web Data Connector with OAuth

October 22, 2022 - Tableau, Web Data Connectors

Personal note: Yes, I'm back! I'm not going to make any promises on how often I'll update this blog but I am still here! If there's a topic you want me to cover submit a topic request on the right!

When I used to work at Tableau I would often get asked the same questions about Web Data Connectors. Most of them were "How the heck do I refresh this thing on Tableau Server or Online!?" But many of them were around how to write a WDC for an API that uses OAuth and errors they were getting related to CORS policies. In this post, I'm going to build an extremely limited WDC for Reddit to show you how to answer these questions. If you actually need a WDC for Reddit, feel free to remix the final code and add more tables or data. Or, if you want to hire me to build one for you, reach out to me on fiverr! Want to try out the finished product? Start a new WDC connection in Tableau Desktop 2019.4 or higher and connect to https://sliceofkeesh-reddit-wdc.glitch.me/.

Prerequisites

Hosting option 1: Glitch

To use a WDC to connect to data it must be hosted somewhere. If you want to get things up and running quickly you can get started by remixing my starter template on Glitch. Glitch is a free tool for developing and hosting apps collaboratively. It's also where you can find the finished code for this and many of my tutorials and resources. I recommend creating an account with Glitch so you can save your project, but it is unnecessary to follow this tutorial. Once you've remixed the starter template visit https://<your-project-name-here>.glitch.me and if you see "Hello World!" everything is good to go!

Hosting option 2: Local environment

If you'd like to follow along locally instead you can set up Node to serve your extension:

  1. Install Node.js if you don't have it already.
  2. Download and unzip my starter template.
  3. Open the terminal or command prompt within the unzipped folder and run the command npm install to install the dependencies.
  4. Next run npm start to start up your WDC.
  5. Visit http://localhost:8080 in your browser and you should see "Hello World!" if everything is set up correctly!
Tableau Desktop 2019.4+

This tutorial will use modern javascript methods and therefore requires a Tableau Desktop version of 2019.4 or higher. You can also use a recent version of Tableau Public Desktop to build & test your WDC.

A Reddit Account

In order to get data from Reddit you will need a free Reddit account.

Start the debugger

Tableau Desktop has a built-in debugging tool that can help us troubleshoot any problems we come across while developing our WDC. First, close any instances of Tableau Desktop then follow the instructions here to start Tableau Desktop with the debugger running. (Note, if you are using 2021.1 or higher you no longer need to use an old version of chrome to open the debugger.) Once you've started Tableau Desktop with the debugger go to http://localhost:8696 (or whichever port you chose) and you should see a list of "Inspectable Pages" and the "Discover Pane". Keep this open so you can debug as we go along. Once you connect to a WDC you will see it show up in this list (make sure to refresh!).

Create a Reddit Application

The Reddit API requires OAuth for authentication. OAuth is an open standard authorization framework that allows users' accounts to be used by third-party services without exposing their credentials. In order to use it the third party (the WDC developer in this example) needs to register their app with the service (in this case Reddit) and then a user can give a set of permissions to that application to act on their behalf. So before you can begin you need to register a new app with Reddit. To register a new app you only need a few inputs:

  • Name: A human-friendly name to identify your app.
  • App Type: There are three types of apps. A web app runs on your own server and can keep secrets. An installed app runs on devices (like a mobile phone) and can't keep secrets. And a script app runs on your own hardware, and can keep a secret but only has access to your account. For this example, we'll be creating a web app.
  • Description: A short description of what your app does.
  • About URL: A link to where someone could find more information about your app.
  • Redirect URI: This is probably the most important input. Once a user successfully authenticates with Reddit, their browser will be instructed to redirect to this location. You will also be asked for this URI as a part of requesting credentials and if it doesn't match exactly the request will fail. This can be any route in your WDC but for this sample, I'll be using <https://www.mywdcdomain.com>/callback

Enter all of the requested information and click create app to save. You will then see your newly created app and should take note of two values:

  • App ID: Beneath the title of your app and the type you should see an ID that looks something like p_jcoLKBwciufo.
  • App Secret: You should also see a field "secret" that has your new app's secret that looks something like abcd4567oeRg123a3hnY_ufo_w888w. Do not share this or save it somewhere public!
    Take note of both of these values as we will need them soon.

    Note: Technically you can set your redirect URI to go back to the root of your application at / but keeping things separate makes it easier to code. Also, if you redirect back to / and that is the last URL before you start gathering data, the URL including the code and state will be what's saved in your Tableau Desktop WDC history. (Not great.)

Set up the landing page

Now that we have our Reddit application set up let's dive into the WDC. Before we start coding let's set some environment variables that we'll want to use later. In this example, we definitely don't want to store our application's secret in the front end, and saving it in the .env file can be a good alternative. Let's add our client id, client secret, a random string we can use to sign cookies, and our redirect URI. Not all of these things are necessarily secrets but I like having them all in one place. It also makes copying this app and building your own on top of it easier since you can just update the .env file with your own info and get going! It should look something like this:

CLIENT_ID=p_jcoLKBwciufo
CLIENT_SECRET=abcd4567oeRg123a3hnY_ufo_w888w
REDIRECT_URI=https://sliceofkeesh-reddit-wdc.glitch.me/callback

Now let's start working on the landing page for the WDC. The landing page is where we will start the authentication process. In this sample, I won't be adding any user inputs but this is where you could add additional data filters or options on how to bring in data. Within the Express starter app, you will see we already have a few files to help us get started. The most important is the server.js file which is serving our application and already has a route defined for our index.html page at GET /. Let's start with the index.html page. First, we'll need to bring in the Tableau WDC library in order for our WDC to work, so add the following to the <head> tag of your index.html:

<script src="https://connectors.tableau.com/libs/tableauwdc-2.3.latest.js"></script>

You should already see links to your script.js and style.css files but if not add those as well:

<link rel="stylesheet" href="/style.css" />
<script src="/script.js" defer></script>

Tip: As you are developing your WDC you may notice that your code changes aren't being reflected in Tableau. This may be caused by the caching of your script.js file. One trick to stop that from happening is to add a fake version to your script like src="/script.js?v=1" and updated it whenever you've made a change.

Next, we can begin building out the page with a title and a link to authenticate. The link should take us to our own route at /authenticate which we will create next. The code within your <body> tag should look something like this:

<h1 class="title">Sample WDC for Reddit</h1>
<a class="button is-link" href="/authenticate">Log in</a>

Note, I'll be using the Bulma CSS framework for styling so this tutorial can focus on the mechanics. You can add Bulma to your project by adding this to the top of your style.css file:

@import 'https://cdn.jsdelivr.net/npm/[email protected]/css/bulma.min.css';

Create the authentication link

Now that we have the basics of our landing page we need to code what actually happens when /authenticate is visited. To do this we will move to the server.js file. Here we are going to add a new GET route for /authenticate. I won't be going over the details of Express and routes in this post so if you want to learn more take a look at their resources.

app.get('/authenticate', (req, res) => {});

Within this route, we need to craft the appropriate URL to send our users to in order for them to authenticate with Reddit. Remember that OAuth allows us to act on a user's behalf so this is the first step of getting permission from the user to access their account. When we send the user to the login page we need to pass along some information about our application, specifically:

  • client_id: This is the ID of the application you registered with Reddit earlier and tells Reddit which application is making the request.
  • response_type: For this sample, we'll be using the authorization code grant flow so here we will put code.
  • state: A string that is unique for each authorization request. This string will be sent back with the response and you can use it to verify that it matches the one you sent. For this post, I'm not going to go over using state but you can see one way of using it in the full code sample.
  • redirect_uri: The exact URI that you entered when creating your Reddit application.
  • duration: This indicates whether or not you want a permanent token. Since our users will likely want to refresh this WDC connection automatically we need a permanent way to authenticate on their behalf. If we ask for a permanent token we will receive a refresh token alongside the 1-hour access token. We can then use the refresh token to get new access tokens after our first one expires.
  • scope: There are many different things you can do with the Reddit API and for each type of action you need specific permission. Here is where we ask explicitly for the permissions, or scopes, that we need in order to pull data from the Reddit API. In our example, we want identity so we can access the user's Reddit username and history so we can access their activity history (past submissions, comments, etc.).
    Within our GET /authenticate route we can create an object with all of these inputs, format them into a query string, and then send the user to the authentication URL with a redirect:
const params = {
  client_id: process.env.CLIENT_ID,
  response_type: 'code',
  state: 'keshia',
  redirect_uri: process.env.REDIRECT_URI,
  duration: 'permanent',
  scope: 'identity history',
};
const query = new URLSearchParams(params).toString();
res.redirect(`https://www.reddit.com/api/v1/authorize?${query}`);

Note that we are pulling in some of the values we set in the environment variables with process.env.VARIABLE_NAME_HERE
Let's go ahead and test out our WDC so far. At this point, you should be able to click the link on your landing page, be taken to the Reddit login page, and then be asked to allow our application. You'll notice within the request to allow our application you will see out the permissions the application is asking for. This list is directly tied to the scopes we used when making the request. Make sure you are only requesting the scopes you actually need to make your application work. Once you click allow, Reddit will send the user back to our /callback route along with a couple of parameters. We will use the code parameter to then request an access token.

Get an access token

Now that we have the link set up we need to be able to accept the response from Reddit after the user authenticates and gives our application permission. Remember that we set the redirect URI to be at our GET /callback route so let's build that out now. The first thing we need to do is grab the code that was sent back to us.

app.get('/callback', async (req, res) => {
  const { code } = req.query;
});

For this next part of the code, I'm going to start a new reddit.js file to hold a bunch of functions to do work with the Reddit API. This makes our code a bit cleaner and also makes it easier for you if you decide to expand on this sample. The new reddit.js should be at the same level as the server.js file. In this new file, we are going to be making calls directly to the Reddit API and we will need a library to help us do that. There are a bunch of packages for making requests in Node but I like node-fetch since it mimics what I'm already used to in the front end. (You could definitely use one of the Reddit libraries out there to avoid crafting your own requests but I want to show you how you would go about this whole process whether there is an existing library for the API you're using or not.) In order to use node-fetch we'll need to install its package. If you are using glitch, simply go to the package.json file and click add package at the top. If you are running things locally then use npm install. I'll be using the latest release of v2 for node-fetch which remains compatible with CommonJS. In glitch, you can update the package.json file directly to ensure the version and when running locally you can append the version to the package name. (ex: npm install [email protected]). Once it's installed let's import it into the reddit.js file:

const fetch = require('node-fetch');

Next, we are going to create an exported getToken function that we can then import into our server.js file later. This function will take the code returned from Reddit and use it to get an access token. Once we have the access token, we can use it to get data from the Reddit API.

exports.getToken = async (code) => {};

To get the token we need to send a request to the Reddit access token endpoint along with some information about our application. Most of these inputs we've already gone over but here we have one new one called grant_type. The grant type indicates what flow we're using, here we are using the code-based flow. First, we can set the URL and set up our query parameters:

const url = 'https://www.reddit.com/api/v1/access_token';
const parameters = {
  grant_type: 'authorization_code',
  code,
  redirect_uri: 'https://sliceofkeesh-reddit-wdc.glitch.me/callback',
};
const body = new URLSearchParams(parameters).toString();

To make this request we will also need to pass our application ID and secret in the Authorization header of the request. The format is a base 64 encoded client_id:client_secret. We can use the built-in btoa() function to do this encoding.

let client_auth = btoa(`${process.env.CLIENT_ID}:${process.env.CLIENT_SECRET}`);

Next, we need to include these inputs in the appropriate parts of our request:

const options = {
  method: 'POST',
  body,
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded',
    Authorization: `Basic ${client_auth}`,
  },
};

Now we have everything we need to actually make the request. You might have noticed that the function we're working on is flagged with async making it asynchronous. We will be taking advantage of that here when we send out these requests. Using the async/await pattern we can just add the await keyword before our requests to make our code wait for those promises to be resolved. To learn more about async/await and promises I recommend javascript.info's tutorial.

const response = await fetch(url, options);
const data = await response.json();

Here we are sending the request to the URL with our specific details and then parsing the response as JSON. If we don't receive an access token we want to return with an error field which we can then use to act appropriately.

if (!data.access_token) return { error: data };

At this point we could just return the data as is but I know that for the endpoints we want to use from the Reddit API we will need the username as a part of our requests. Therefore at this point, I'm going to use the token we just received and get the current user's username before returning the data. To do this I'm going to create a new function outside of my getToken function that I can just use within the reddit.js file:

async function getIdentity(access_token) {
  try {
    const url = 'https://oauth.reddit.com/api/v1/me';
    const options = {
      headers: {
        Authorization: `Bearer ${access_token}`,
      },
    };
    const response = await fetch(url, options);
    const data = await response.json();
    return data.name;
  } catch (error) {
    return { error };
  }
}

Here's what's going on:

  1. The function takes an access token as input. We're using this access token in the Authorization header as a bearer token.
  2. We send off a request to the identity endpoint GET /api/v1/me.
  3. We pull out the name from the response and return it.

Now that I have this helper function I can use it to grab the username, attach it to the access token response, and send it all back:

let username = await getIdentity(data.access_token);
data.username = username;
return data;

Now let's switch back to our server.js file. Here we want to use the new getToken function we just created. First, we need to import our *reddit.js file at the top:

const reddit = require('./reddit');

Now we can use our exported function to get a token within our GET /callback route. If we get an error from the getToken function then we can stop processing and respond with an error:

const data = await reddit.getToken(code);
if (data.error) return res.status(500).send('Failed to retrieve token.');

Now that we have our token we will need to send it back to the front end. An easy way to do this is by settings cookies. We can use a package to help us with this like cookie-parser. Again make sure to install this package first and then we can bring it into our server.js file and setup our app to use it as middleware:

const cookieParser = require('cookie-parser');

app.use(cookieParser());

At this point, we can finish out the rest of the GET /callback route by setting cookies for the access token and the refresh token we received from Reddit and returning the user back to the root at /.

res.cookie('username', data.username);
res.cookie('access_token', data.access_token);
res.cookie('refresh_token', data.refresh_token);
res.redirect('/');

Note: In a production environment you should probably sign these!

Set up the WDC

After completing the authentication flow we can now start working on the WDC itself. We're going to move to the script.js file in the public folder where we will be putting all the code for creating the connector and schema and getting data. The first thing we need to do is register the connector and set up our three main functions:

let myConnector = tableau.makeConnector();

myConnector.init = async function (initCallback) {};

myConnector.getSchema = function (schemaCallback) {};

myConnector.getData = async function (table, doneCallback) {};

tableau.registerConnector(myConnector);

Here's what's going on:

  1. We created a new connector called myConnector and started to define its three functions. init will handle any code we want to run at the start of anything we do, getSchema will be called to create the table schemas, and getData will be called when we want to actually get data. Finally, we registered the created connector with Tableau.
  2. Note that for each function Tableau provides a callback to let it know that we are done.
  3. Similarly, note that the getData function also has a table argument. The function is run per table so table is the current table object and is where we will append the data once we've collected it.

Let's take a look at the init function first. This function is called at the beginning of each phase of the WDC. Since this WDC does require authentication the first thing we need to do is set the authentication type. The none type is the default if the type is not set and is used if there is no authentication required. The basic type can be used if you simply need the user to enter a username & password. Since we are authenticating with OAuth we will need to use the custom type and provide our own way of re-entering credentials. This example is pretty basic so we can simply just show the landing page again but you can also build out a specific route for handling authentication if you want to expand on this sample.

myConnector.init = async function (initCallback) {
  tableau.authType = tableau.authTypeEnum.custom;
};

I mentioned earlier that there are different phases to a WDC and that this initialization function would be run at the beginning of each of them. Each phase is for a different part of the WDC process so we will need to check what phase we are in and then act appropriately. There are two main phases to the lifecycle of a WDC. First is the interactive phase which is the phase where our landing page is shown. Here we're able to get inputs from the user before closing the window and moving on to the second phase. The second phase is the gather data phase. At this point, the WDC window is closed and Tableau will start creating the tables requested which will be populated with data soon. There is also an authentication phase but this is really just a specific version of the interactive phase. This is used specifically when authentication credentials need to be re-entered. If you are building a more complex WDC it is a good idea to have a separate route that only handles the authentication that can be shown to the user on its own. In this example, we're not doing anything else but authenticating so it's all in one and we can combine the instructions for these two phases together:

if (
  tableau.phase == tableau.phaseEnum.interactivePhase ||
  tableau.phase == tableau.phaseEnum.authPhase
) {
}

Within the if statement for the interactive/auth phase we want to check if we have an access token in the cookies. If we don't then we haven't logged in yet and we don't need to do anything. If we do then we want to save the information and "submit" the connection which will move us to the next phase.

let cookies = {};
document.cookie.split(';').forEach(function (c) {
  let [key, value] = c.split('=');
  cookies[key.trim()] = value;
});
if (
  cookies.username &&
  cookies.username.length > 0 &&
  cookies.access_token &&
  cookies.access_token.length > 0 &&
  cookies.refresh_token &&
  cookies.refresh_token.length > 0
) {
  tableau.username = cookies.username;
  tableau.password = JSON.stringify({
    access_token: cookies.access_token,
    refresh_token: cookies.refresh_token,
  });
  tableau.connectionName = 'Your Reddit Data';
  tableau.submit();
}

Here we are using document.cookie to retrieve the cookies we set from the server and then checking if everything we need has been set. If so we then save this information to tableau.username and tableau.password. Note that it is important to save this and any other information that you will need to fetch data into one of the tableau data storage objects. Either tableau.username, tableau.password, or tableau.connectionData. Anything else, like global javascript variables or cookies, will not be available once you move to the next phase. All of these storage objects only accept string values so we can use JSON.stringify() to turn a JSON object into a string that we can parse later. tableau.password specifically should be used for any sensitive information like access tokens. This data is never written to the workbook but can be securely embedded in the data source if you choose to publish it. If you make a connection to this WDC and then close the workbook, when you reopen it Tableau will immediately start the WDC in the authentication phase where you can do whatever is needed to get credentials.

Note: Each of the two main phases is run in different instances so if you are using the debugger and you go from the interactive phase to the data-gathering phase you will need to refresh it and connect to the new instance that is being used for the second phase.

The last thing we need to do within our initialization function is to use the callback to let Tableau know we've completed our setup:

initCallback();

We will be adding a bit more to this function but let's move on to the data schema for now.

Create the table schemas

After you complete the interactive phase and move to the data-gathering phase the getSchema function will be called to set up the schema of the tables. This function is pretty straightforward but you need to create an object for every single column you want to bring in for each table. In this sample, I'm just going to have two tables, one for users' comments and one for the users' submissions. First, we'll create a column definition for the comments table like so:

const commentsCols = [
  {
    id: "id",
    alias: "ID",
    dataType: tableau.dataTypeEnum.string,
  },
  {
    id: "ups",
    alias: "Ups",
    dataType: tableau.dataTypeEnum.int,
  },
  {
    id: "downs",
    alias: "Downs",
    dataType: tableau.dataTypeEnum.int,
  },
  ...
];

The objects in this array should match the data you plan to pull from the Reddit API and each need to have an id, an alias, and a dataType. There are some other properties you can add such as default aggregation or column type as well to get a nicer experience. You can learn about these other properties and their accepted values in the WDC API reference. Next, create the table object by giving it its own id and alias and setting the columns to be the array we just created:

const comments = {
  id: 'comments',
  alias: 'Comments',
  columns: commentsCols,
};

Do the same thing for the submissions table and any other tables you would want to create with your WDC. Finally, submit all your table schemas to Tableau using the callback provided:

schemaCallback([comments, submissions]);

Get the data

The next function we need to fill in is the getData function. Here is where we will actually send off requests to get data from Reddit. We're going to be using two different endpoints for the comments and submissions table but both of them require an access token and a username. Let's start by setting up these values and our request options:

const username = tableau.username;
const access_token = JSON.parse(tableau.password).access_token;
const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ username, access_token }),
};

Remember that the tableau.password is stored as a string so you'll need to parse it back to an object first to retrieve the access token. Additionally, we are setting the method as "POST" and the content type to be JSON which will match what the server is expecting. Finally, we send the username and access token in the body of our request. Now when I say "what the server is expecting" I am talking about our server, not the servers running the Reddit API. Because here we are not going to be sending these requests directly to Reddit but through our own server. In this case, our server is acting like a proxy to make the requests to Reddit on our behalf.

A Note About CORS
Many times when working with APIs you may notice that you are blocked by CORS-related errors when you try to make requests. Cross-Origin Resource Sharing (CORS) is an HTTP mechanism that allows a server (Reddit in our case) to indicate any origins other than its own from which a browser should allow requests. Internet browsers follow a same-origin policy that restricts cross-origin (different domain, scheme, or port) HTTP requests initiated from the front end (our script.js file). This means that you are only allowed to make requests from the front end if the server includes the right CORS headers in response to a "preflight" request. All of this is to prevent Cross-Site Request Forgery (CSRF) attacks where malicious parties can use information stored in your browser to access your data. By using a back-end server to send the requests we get around these limitations since we are not sending the request from the user's browser. (There's more to it than this but hopefully this gives you a high-level understanding of why this problem happens when building WDCs.)

So let's flip back to the server.js file. Before we can accept the JSON body that we will be sent from the front end we need to make sure our server can parse incoming requests with JSON payloads. We can use the built-in middleware function that comes with Express to do this across all routes. Add this near the top of server.js with the other app.use statements:

app.use(express.json());

Now let's make two new routes to get a user's Reddit comments and submissions:

app.post('/getComments', async (req, res) => {
  const { username, access_token } = req.body;
});

app.post('/getSubmissions', async (req, res) => {
  const { username, access_token } = req.body;
});

For each of these routes, we are grabbing the username and access token from the body we sent from the front end. Next, we'll want matching functions in our reddit.js file for comments and submissions. Let's start by adding a new getComments function in reddit.js that takes a username and access token as inputs:

exports.getComments = async (username, access_token) => {};

Next, we'll want to set up our request:

const baseUrl = `https://oauth.reddit.com/user/${username}/comments?limit=100`;
const options = {
  headers: {
    Authorization: `Bearer ${access_token}`,
  },
};

But before we go straight into making the request we need to think about what will be happening here. The endpoints we will be using to get all of a user's comments and submissions will not return everything at once if there are hundreds or thousands of items. Instead, it will send the first X amount, where X is a limit that is 25 by default and can be as high as 100. Therefore we will need to loop through each page of comments or submissions and collect them as we go until there are no more to get. The way each API does its pagination is different but Reddit provides an after property in its responses that you can use to get the next set of data. So in our code, we'll start by fetching the first page without specifying values for after, then we'll grab the after value returned in the response. If it's null then there are no more items to get and we can stop our loop, otherwise, use the received after and pass it in the next request to get the next set of items. To write this loop we'll need a couple of things to set us up:

let comments = [];

let url = baseUrl;

let after = null;

Here we are creating an empty array where we will hold all the comments as we loop through them. Then we are creating a new variable that we can use for the first iteration and then can update in subsequent iterations through the loop. Finally, we have an empty variable where we can hold on to the after value we get from Reddit. Next, we'll set up a do...while loop to loop through the pages of data from Reddit. A do...while loop will run the code within it at least one time and then continue to loop if a certain condition is still true. Since we will be able to use the after value returned from Reddit to determine if there is more data to fetch we can populate our variable with each request and only keep running the loop if after isn't empty:

do {} while (after !== null);

Within the loop, we'll need to fetch data, format it in a way that Tableau can accept, and push it into our comments array. Additionally, after we've gathered the data we need to set our own after variable to be the after value that we received from Reddit and update the url to use this value as a query parameter:

const response = await fetch(url, options);
const data = await response.json();
if (!data.data.children) return { error: data };
for (let item of data.data.children) {
  comments.push({
    id: item.data.id,
    ups: item.data.ups,
    downs: item.data.downs,
    subreddit: item.data.subreddit,
    post_title: item.data.link_title,
    link: item.data.link_permalink,
    created: new Date(item.data.created * 1000).toISOString(),
    body_preview: item.data.body,
  });
}
after = data.data.after;
url = `${baseUrl}&after=${after}`;

Note that the count and names of the properties you select here must match exactly the number and ids of the columns you defined in the getSchema function. After we run through all the comments data and the loop finishes we simply return with all the comments we stored in the comments array:

return comments;

Do the same thing now for the submissions in a new getSubmissions function but of course, pull out different information from the submissions data. You can see the full code sample here.

Now that we have a function for getting comments and submissions let's use them in their respective routes within server.js:

app.post('/getComments', async (req, res) => {
  const { username, access_token } = req.body;
  const data = await reddit.getComments(username, access_token);
  res.send(data);
});

app.post('/getSubmissions', async (req, res) => {
  const { username, access_token } = req.body;
  const data = await reddit.getSubmissions(username, access_token);
  res.send(data);
});

And finally, let's bring it all the way back to the script.js where we will be calling these routes after defining the options object. Remember that the getData function is called once per table, this means we need to first identify what table is currently in play and decide which data to fetch. We can do that with a new url parameter and a switch statement:

let url = '';
switch (table.tableInfo.id) {
  case 'comments':
    url = '/getComments';
    break;
  case 'submissions':
    url = '/getSubmissions';
    break;
}

Basically, we are looking at the ID of the current table (which we set ourselves in the getSchema function), and based on that we determine which route to use to get data. Then we can go ahead with our request to get the appropriate data, add that data to the table, and call the callback to let Tableau know we're done getting data for that table:

const response = await fetch(url, options);
const data = await response.json();
if (data.error)
  return tableau.abortWithError(
    'Failure fetching data! ' + JSON.stringify(options) + JSON.stringify(data.error)
  );
table.appendRows(data);
doneCallback();

You'll notice I am handling errors using tableau.abortWithError. This is a great way to let the user know (and yourself while debugging) that something went wrong and the details of what happened. Using this method pops up an error box in Tableau with the string you pass along with it.

Refresh the tokens

If you've followed along this far you should now be able to connect to your WDC, authenticate to Reddit, bring in a table, and click Update Now to see the table populated with data. You could stop here and your WDC would work just fine on the first connection. However, if you were to publish this data source to Tableau Server or Online you wouldn't be able to refresh it because after 1 hour the access token you received would expire. Because of this, we want to add some code that will refresh that token for us without the user having to do anything manually. This is why we asked for a permanent type token in the beginning and why we also save the refresh_token in the tableau.password. While the refresh token itself cannot be used to access a user's data, it can be used to generate a new access token.

Since the getData function is run per table and we really only need to refresh the token once per extract refresh we can add this code to our init function. Right after the if statement for the interactive and auth phases and before the initCallback add a new if statement that will only run when in the data gathering phase:

if (tableau.phase == tableau.phaseEnum.gatherDataPhase) {
}

We're going to be sending a request to our back end to do the refresh for us. First, we'll grab the information we need from tableau.password, send it in the body of a POST request to our server, and finally, we will update the tableau.password with the newly minted access token we get back.

const refresh_token = JSON.parse(tableau.password).refresh_token;
const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ refresh_token }),
};
const response = await fetch('/refreshToken', options);
const data = await response.json();
if (data.error)
  return tableau.abortForAuth('Failure refreshing access token! ' + JSON.stringify(data.error));
tableau.password = JSON.stringify({
  access_token: data.access_token,
  refresh_token,
});

On the server side we're adding a new refreshToken route:

app.post('/refreshToken', async (req, res) => {
  const { refresh_token } = req.body;
  const data = await reddit.refreshToken(refresh_token);
  res.send(data);
});

Again, we're creating a new exported function to be used here. In our reddit.js file create a new refreshToken function as follows:

exports.refreshToken = async (refresh_token) => {
  const url = 'https://www.reddit.com/api/v1/access_token';
  const parameters = {
    grant_type: 'refresh_token',
    refresh_token,
  };
  let client_auth = btoa(`${process.env.CLIENT_ID}:${process.env.CLIENT_SECRET}`);
  const body = new URLSearchParams(parameters).toString();
  const options = {
    method: 'POST',
    body,
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      Authorization: `Basic ${client_auth}`,
    },
  };
  const response = await fetch(url, options);
  const data = await response.json();
  if (!data.access_token) return { error: data };
  return data;
};

You'll notice it is very similar to the getToken function except with different parameters being sent. This is where using a Reddit library starts to come in handy, as you build this WDC out you wouldn't want to keep repeating similar things like this or the get X requests, I'm just doing this to show you the details.

Now that everything is plumbed through from the script.js file to server.js and to redit.js you should be all set. Now, any time the WDC is used to refresh the extract it will first make a call to get a fresh token before requesting data from the Reddit API.

Wrap it up

Now you have completed all the steps to get data from the Reddit API and be able to refresh the extract programmatically in the future! I hope this tutorial was helpful in understanding how to work with OAuth and CORS when building a WDC. At this point, I encourage you to add to the code to make it useful to you. Perhaps you could:

  • Get more tables of data.
  • Add inputs that let you choose which data to bring in.
  • Add inputs that filter data by date.
  • Create Standard Connections for useful pre-joined tables.

Let me know if this sample helped you in the comments below and subscribe to see what other things I make in the future!

🌟 Load live code sample! 🖱️
Subscribe

Sign up to never miss a post!

Leave a comment

Your email address is optional and will not be published.

A picture of Keshia Rose.

About Me

Hi, I'm Keshia, an ex-Tableau Product Manager currently evangelizing Fingerprint Pro to developers who has a love for building fun random apps and helping people solve problems. Keep coming back to my blog to learn how to use Tableau APIs and developer tools or to hear about other cool tech I'm learning about!
Subscribe

Sign up to never miss a post!

Topic Request

Want me to cover a specific topic? Let me know!