Some Best Practices On Building An Integration

This was originally posted to the DEV Community

Hi, I’m Andrew McIntosh. I’m a software engineer at FreshBooks. I’ve moved around development teams a couple of times, but right now I’m working on our API and Integrations team, trying to make things better for developers looking to build API integrations. If that’s you, or you want that to be you, here are five bits of advice on how you can build an (or build a better) API integration:

  1. Get To Know The API
  2. Use Libraries, Tools, and SDKs
  3. Limit The Scope of Requests
  4. Properly Handle Errors
  5. Do More Asynchronous Work

Get To Know The API

This might seem obvious, but read the docs of whatever API you’re working with. This is really the starting point on figuring out how you can do the thing you want to do. Reading through, you might even find a better way to do something than you initially thought. A good API is intuitive and predictable, but even good APIs can have odd, unexpected behaviour in places. Look at examples too as they can really help in cases where the documentation is dated or sparse.

Keeping API documentation up to date is not always easy, so if you find some place where it’s wrong, someone will be very happy if you report it.

Use Libraries, Tools, and SDKs

A company with an API wants it to be used, so often they invest time and effort into making things that will make your life using it easier (like this!). Take advantage of the work they did for you and consider using libraries or SDKs they’ve provided. If a company doesn’t have an SDK, or if what they have doesn’t fit your language or framework of choice, take a look for a third-party solution (often API docs will list a bunch of these in addition to company-built ones).

Using an SDK or library isn’t generally required, but there are advantages to not having to reinvent the wheel. Many of the topics I’m going to cover next might already be handled by a well-built SDK! Often SDK developers know the API quite well (especially if they’re employees of the company) and they can often simplify, clarify, or even paper over those oddities or inconsistencies I mentioned in documentation.

For example, for historical reasons a lot of the oldest accounting endpoints in FreshBooks’ API return dates in North American Eastern Time (US/Eastern, aka EST/EDT; time zones are complicated), while all newer endpoints return dates in UTC (we’re moving everything to UTC, but it takes time). If you’re using one of our SDKs, we hide that from you and return all the dates in UTC. We’ve done the work so YOU don’t have to figure out which is which!

Just like documentation, if you find a bug or missing feature in a tool, the owner would love your feedback, and if it’s open source and you’re keen, you could even try to fix it yourself.

Limit The Scope of Requests

For your own benefit, as well as the API owner’s, you want to fetch data in an efficient way. This means not grabbing data that you don’t care about and will just throw away. It also means not fetching too much data at once and then hitting memory or performance issues while processing it, which can spill over into slow responsiveness for your users.

This means that you should look at how an API handles filters (to limit returned records to only those you care about), pagination (fetch records in smaller batches so you can handle them in chunks), sorting (so those batches come in an order that works for you), and maybe even what fields are included in the response (so the record itself isn’t filled with data you don’t need). Here is FreshBooks’ Search, Paging and Includes documentation.

For example, one integration I was debugging would sync invoices from another service to a particular client. It would check to see if that client existed before creating it:

const matchingClients = await freshbooksClient.clients.list(accountId)
const matchingClient = matchingClients.data.clients.find(
    (currClient: Client) => currClient.organization === clientOrganization
)

But it was sometimes creating duplicate clients. This was because FreshBooks defaults to returning 30 records at a time with the newest ones first. This worked fine when it first created the client, but as customers used the app and made more clients, the client to sync with got bumped off of the first 30 results and was no longer found. In addition to that, the code was fetching as many clients as it could, and then filtering them in memory. It needed either a filter or pagination (or both).

const clientSearchQueryBuilder = new SearchQueryBuilder().like('Organization_like', clientOrganization)

// Ask the API for only the clients matching the organization
const matchingClients = await freshBooksApi.clients.list(
    currentUser.fbAccountId,
    [clientSearchQueryBuilder]
)
// Take the first match, if there is one
const matchingClient = matchingClients.data.clients.length > 0
    ? matchingClients.data.clients[0]
    : undefined

Let the API do that work!
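When no filter fits and you really do need every record, paging through them in batches keeps memory use predictable. Here’s a rough sketch of that idea. The `fetchPage` function and the `Page` shape are assumptions for illustration (they are not part of any particular SDK), so adapt the field names to whatever the API you’re using actually returns.

```typescript
// Hypothetical shape of one page of results; real APIs name these fields differently.
interface Page<T> {
    items: T[]
    pages: number // total number of pages available
}

// Hypothetical function that fetches a single page (pages start at 1)
type FetchPage<T> = (page: number, perPage: number) => Promise<Page<T>>

// Walk every page in order, handing each batch to the caller as it arrives
// instead of collecting everything in memory first. Returns the record count.
async function forEachPage<T>(
    fetchPage: FetchPage<T>,
    handleBatch: (items: T[]) => Promise<void> | void,
    perPage = 100,
): Promise<number> {
    let page = 1
    let total = 0
    while (true) {
        const result = await fetchPage(page, perPage)
        await handleBatch(result.items)
        total += result.items.length
        // Stop once the last page has been handled
        if (page >= result.pages) return total
        page += 1
    }
}
```

Each batch is handled and then dropped, so the integration never holds more than one page of records at a time.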

Properly Handle Errors

There’s a lot to proper error handling, so let’s look at things in pieces.

API Errors

A lot of API docs will have information on error codes, states, and messages. You don’t want your integration to break unexpectedly, so you should look to handle these. Logging response messages is really helpful when building an application to understand business rules or validation failures. When you’re up and running in production, it’s equally important to help you know why something might be failing. For example, a good API won’t just give you a 422 Unprocessable Entity, but might return a message like “422 - At least one field among first_name, last_name, email or organization is required” (a FreshBooks client validation error message).

The API isn’t your code, and you don’t control what comes out. This is a black box, and it’s helpful to follow defensive coding practices. You should check HTTP response codes, wrap your calls in try/catches, etc. Don’t assume that a response you get has an object or resource data structure, as a failure may return an error data structure instead, so you should be prepared to parse or otherwise handle either. If a REST API’s backend service is down, a company’s API gateway might not even return you JSON but instead give you an HTML error page that your JSON-expecting code will just fail to parse. In short, don’t assume that the response will always come in the form you expect, and ensure you handle cases when it doesn’t.
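As a rough sketch of what that defensiveness can look like, the function below takes the raw pieces of an HTTP response and returns either the data or a described failure, without ever throwing on a surprise HTML page. The `ApiResult` type, the function names, and the assumption that errors arrive as `{ "message": "..." }` are all mine for illustration; check your API’s docs for its real error structure.

```typescript
// Either the parsed data, or a failure the caller can log and act on
type ApiResult<T> =
    | { ok: true; data: T }
    | { ok: false; status: number; message: string }

// Assumes (hypothetically) that error bodies look like { "message": "..." }
function extractMessage(parsed: unknown): string {
    if (typeof parsed === 'object' && parsed !== null && 'message' in parsed) {
        const message = (parsed as { message: unknown }).message
        if (typeof message === 'string') return message
    }
    return 'Unknown API error'
}

function parseApiResponse<T>(
    status: number,
    contentType: string | null,
    body: string,
): ApiResult<T> {
    // A gateway error page is often HTML, so check before trying to parse
    if (!contentType || !contentType.includes('application/json')) {
        return {
            ok: false,
            status,
            message: `Expected JSON but got ${contentType ?? 'no content type'}`,
        }
    }
    let parsed: unknown
    try {
        parsed = JSON.parse(body)
    } catch {
        return { ok: false, status, message: 'Response body was not valid JSON' }
    }
    // Anything outside 2xx is a failure, even if the body parsed fine
    if (status < 200 || status >= 300) {
        return { ok: false, status, message: extractMessage(parsed) }
    }
    return { ok: true, data: parsed as T }
}
```

The caller then has to check `ok` before touching `data`, which makes it hard to forget the failure case.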

Timeouts and Connection Exceptions

The API you’re using isn’t perfect. It could be slow or offline. In these cases, your integration could run into issues if all the processors or workers are stuck waiting on the API. For this reason, you should configure timeouts on your calls. HTTP clients have easily set timeouts but often they are not enabled by default. Again, a good SDK will have some sane defaults for you (we default to 30 seconds but let you easily override).

In general there are connect timeouts (how long you wait to establish a connection), and read timeouts (how long you wait for the response). Some clients let you set them together or individually, but you’ll want to set both to protect from an unresponsive server (connect), or a really slow response (read).

Figuring out exactly what the timeout values should be is very dependent on your integration and the API you’re using, but even setting them to something high (15-30 seconds) can save you a lot of pain.
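If your HTTP client doesn’t give you a timeout option, you can approximate one yourself. The sketch below is a single overall deadline rather than separate connect and read timeouts, and `withTimeout` is a name I made up for illustration. It uses the standard `AbortController` so that the underlying request can actually be cancelled rather than just ignored.

```typescript
// Run `work` with a deadline. When the deadline passes, the signal is aborted
// (so the request can be cancelled) and the returned promise rejects.
async function withTimeout<T>(
    work: (signal: AbortSignal) => Promise<T>,
    timeoutMs: number,
): Promise<T> {
    const controller = new AbortController()
    let timer: ReturnType<typeof setTimeout> | undefined
    const deadline = new Promise<never>((_, reject) => {
        timer = setTimeout(() => {
            controller.abort()
            reject(new Error(`Timed out after ${timeoutMs}ms`))
        }, timeoutMs)
    })
    try {
        // Whichever settles first wins: the work or the deadline
        return await Promise.race([work(controller.signal), deadline])
    } finally {
        // Always clear the timer so it can't fire after the work finishes
        clearTimeout(timer)
    }
}
```

With `fetch`, for example, this could be called as `withTimeout((signal) => fetch(url, { signal }), 15000)`, where `url` is whatever endpoint you’re calling.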

Rate Limits

Most APIs will limit traffic to prevent a bad actor from hindering other integrations’ performance or degrading the entire system. You should build your integration to respect and accommodate these limits. Running up against them constantly doesn’t do you any good, as your work won’t get done faster, and could result in your integration being banned.

As mentioned above, it’s best to build your integration with efficient calls in mind. Reducing the number of calls you need to make means you’re less likely to hit call limits.

Another good practice is to properly handle rate limit errors (generally HTTP 429 errors), and then rather than retry right away, wait for a bit before making the next call. If the next call is still limited, wait even longer. Increasing the wait after each failed attempt like this is called exponential backoff, and again, most HTTP libraries will have a way to enable this and many SDKs will have implemented this (our SDKs do it here and here).
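If you do need to roll your own, the core of it is small. This is a minimal sketch and not how any particular SDK does it: `retryWithBackoff` and its options are names I’ve made up, and the `sleep` option exists only so the waiting can be swapped out in tests.

```typescript
interface RetryOptions {
    maxAttempts: number // total tries, including the first one
    baseDelayMs: number // wait before the first retry
    shouldRetry: (error: unknown) => boolean // e.g. only retry rate limit errors
    sleep?: (ms: number) => Promise<void> // replaceable for testing
}

const defaultSleep = (ms: number) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))

async function retryWithBackoff<T>(
    call: () => Promise<T>,
    options: RetryOptions,
): Promise<T> {
    const sleep = options.sleep ?? defaultSleep
    for (let attempt = 1; ; attempt++) {
        try {
            return await call()
        } catch (error) {
            // Give up when out of attempts or when retrying won't help
            if (attempt >= options.maxAttempts || !options.shouldRetry(error)) {
                throw error
            }
            // The delay doubles each time: base, 2x base, 4x base, ...
            await sleep(options.baseDelayMs * 2 ** (attempt - 1))
        }
    }
}
```

Two refinements worth considering: adding a bit of random jitter to each delay so many clients don’t all retry at the same moment, and honouring a `Retry-After` response header when the API sends one.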

Do More Asynchronous Work

If your integration is handling a lot of data via an API, you should consider moving as much work to asynchronous tasks as possible. In this way you can avoid blocking your users, manage and throttle your workloads, and easily retry failures. The above advice on handling rate limits gets a lot easier if your processing code isn’t blocking user actions while retrying. It also lets you throttle the calls in whatever queue system you’re using.

Look into Amazon’s SQS, Google’s Cloud Tasks queues, Python’s Celery with something like CloudAMQP, or Bull with Redis. There are a lot of options out there.
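To get a feel for what throttling buys you before picking a queue system, here’s a toy in-memory version. It is only a sketch: `processQueue` is a made-up name, nothing here survives a restart, and a real queue would also give you persistence and scheduled retries. What it does show is capping how many API calls are in flight at once and collecting failures so they can be retried later.

```typescript
// Process items with at most `concurrency` running at once.
// Failures are collected rather than stopping the whole run.
async function processQueue<T>(
    items: T[],
    worker: (item: T) => Promise<void>,
    concurrency: number,
): Promise<{ succeeded: T[]; failed: T[] }> {
    const pending = [...items]
    const succeeded: T[] = []
    const failed: T[] = []

    // Each runner pulls the next item until the queue is empty
    async function runWorker(): Promise<void> {
        while (pending.length > 0) {
            const item = pending.shift() as T
            try {
                await worker(item)
                succeeded.push(item)
            } catch {
                failed.push(item) // keep it so the caller can retry later
            }
        }
    }

    const runnerCount = Math.min(Math.max(1, concurrency), items.length)
    const runners = Array.from({ length: runnerCount }, () => runWorker())
    await Promise.all(runners)
    return { succeeded, failed }
}
```

The `failed` list can be fed straight back into another `processQueue` call, ideally after a pause.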

Another good idea is to use webhooks if the API supports them (FreshBooks does). This allows you to register to receive messages when an event happens. Rather than polling an API every 5 minutes to see if a new resource has been created, you can tell the API to send you a message when that happens. This can save you a lot of calls and overhead.

Going back to that old integration I was debugging, it would sync invoices with FreshBooks every 20 minutes, but the process involved gathering up all the invoices it hadn’t pushed and looping through them. However, the process was driven by a synchronous HTTP call that would time out after just a couple of minutes, killing the process. It could take days to move everything over. Calling the process more often would only help a little, as there were only so many calls the service could handle at one time and these long-running calls could block workers from handling user interaction with the app.

We redesigned the whole sync process of the integration so that everything was done in small asynchronous tasks. On first signup we had an async task that would fetch each invoice (paginated), and put a message on the queue to process that invoice. The invoice processing tasks could thus be retried, scaled up, or throttled as needed. We also utilized webhooks for real-time updates. Each event received would just create another invoice process task. Instead of days, things could be processed in seconds, minutes, or perhaps hours for very large workloads.

Go Out And Build

Well, I hope that gives you a few ideas on building a robust integration. If you know the API you’re using and the tools available, keep your calls efficient, handle the unexpected gracefully, and keep your time-consuming processing away from the user’s actions, you’re on a great path to succeed. If you have any questions, please reach out to me, and if you have anything related to FreshBooks’ API you can email newapi@freshbooks.com.
