by Rob Blades


Right now, you are interacting with data. In fact, you probably interact with data almost every day. Whether you are video calling a friend, watching movies on your smartphone, uploading your vacation photos to social media, paying your bills online, or conducting research for a school project, you are using an interface to send and receive data. 

In the Research and Collections division at the Canadian Museum of Nature, we work with a lot of data. Every day, specimens are digitized, images are created, metadata (the data that describes data) is entered into databases, and data is moved around a network of systems and software.  

One of the tools we use to interface with our data is Portfolio, a piece of Digital Asset Management software (DAMs). A DAMs is a central repository for the content or media that an organization produces. We use our DAMs for assets such as photographs taken at museum events, specimen images, and nature art.

Alongside these digital assets is their metadata. Metadata can be generated automatically by a computer or manually entered by a person. Every file contains computer-generated metadata that describes that file, such as the file type, size, resolution, date it was created, and date it was modified. With a DAMs, a person can also add metadata to that image, such as a title, genre, description, history, and information about the creator. 

This metadata describes the digital asset, helping us to search for it more effectively and provide specific information when we share it with others. For example, the digitized nature art stored in Portfolio is shared on the Canadian Museum of Nature’s Google Arts and Culture page. The metadata we add to Portfolio to describe each image and provide the necessary context to understand each digital asset eventually makes its way online. 

Screenshot of a webpage displaying mushroom watercolour artwork and a description of the information we know about it.
An example of the metadata in Portfolio. Image: Rob Blades © Canadian Museum of Nature 

To get this data online, we interface with Portfolio to export the data for Google Arts and Culture. Like many programs, Portfolio provides the option to easily export both the images and metadata by clicking a few buttons: images are exported into pre-defined directories and metadata is exported to a tab-delimited text file, in which a tab character marks the boundary between one field and the next. Delimited text files are a common way to move data from one system to another. These files can then be used to create new exhibits on our Google Arts and Culture page.

Screenshot of a webpage displaying mushroom watercolour artwork and a description of the information we know about it.
An example of the metadata on Google Arts and Culture. Image: Rob Blades © Canadian Museum of Nature 

Recently, we discovered an issue with some metadata in Portfolio in a collection of about 1,500 images destined for Google Arts and Culture. At some point, tab characters were introduced into some of the metadata fields. Because the only option to export metadata from the Portfolio web interface is a tab-delimited text file, the errant tabs caused our metadata file to be malformed. For you to interface with data, it must be accurate and accessible. In a malformed file, each stray tab is read as an extra field boundary, so the values that follow it shift into the wrong columns and the metadata no longer connects properly to each image. If we added the data in this state to Google Arts and Culture, it would not appear properly, making both the information incorrect and the content effectively inaccessible. We needed to clean our tab-delimited metadata file to make it meaningful.
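To make the problem concrete, here is a minimal Python sketch of how one stray tab inside a field changes the shape of a tab-delimited row. The field names and values are invented for illustration, and the sketch assumes a naive export that simply joins fields with tabs; we are not showing how Portfolio itself writes its export file.

```python
# Hypothetical metadata rows; the field names and values are illustrative only.
header = ["Filename", "Title", "Description"]
clean_row = ["mushroom_001.jpg", "Mushroom study", "Watercolour on paper"]
bad_row = ["mushroom_002.jpg", "Mushroom\tstudy", "Watercolour on paper"]  # stray tab in Title


def to_tsv_line(fields):
    """Join fields with tabs, as a naive tab-delimited export would (an assumption)."""
    return "\t".join(fields)


def parse_tsv_line(line):
    """Split a line on tabs, as a simple importer would."""
    return line.split("\t")


parsed_clean = parse_tsv_line(to_tsv_line(clean_row))
parsed_bad = parse_tsv_line(to_tsv_line(bad_row))

# The clean row matches the header; the bad row has one field too many,
# so every value after the stray tab lands in the wrong column.
print(len(header), len(parsed_clean), len(parsed_bad))  # prints: 3 3 4
```

In the bad row, the title is cut off at "Mushroom" and the rest of it lands in the Description column, which is the kind of mismatch described above.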

There were several ways to interface with this data and resolve our issue, but the most efficient one for our needs was the Portfolio API. An API (Application Programming Interface) allows different pieces of software to communicate with each other by sending and receiving data. Many web applications offer an API to give developers an easier way to access their data and to run large-scale tasks that could not be performed as efficiently through the standard interface.

Shot of a text editor program in dark mode with a portion of a Python programming language script.
An example of Python code. Image: Artturi Jalli on Unsplash https://unsplash.com/photos/g5_rxRjvKmg

To interact with the Portfolio API, we used Python, a popular programming language. With Python, we were able to interface with the API to export the metadata to JSON, a data format commonly used with APIs to send and receive data. Since JSON marks its structure with brackets, braces, and quotation marks rather than with a single delimiter character, a tab inside a field does not break the file. We were able to easily find and replace the tab characters and re-import the corrected data into Portfolio. This experience also showed us many ways we could interface with Portfolio for other tasks.
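The find-and-replace step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record below is a made-up stand-in for exported metadata, the field names are hypothetical, and the calls to the Portfolio API itself are left out because their details are not covered here. The sketch uses only Python's built-in json module.

```python
import json

# Hypothetical JSON export; in JSON text a tab inside a string appears as the escape \t.
exported = '[{"id": 1, "title": "Mushroom\\tstudy", "keywords": ["fungi", "water\\tcolour"]}]'


def strip_tabs(value, replacement=" "):
    """Recursively replace tab characters in every string of a JSON-like structure."""
    if isinstance(value, str):
        return value.replace("\t", replacement)
    if isinstance(value, list):
        return [strip_tabs(item, replacement) for item in value]
    if isinstance(value, dict):
        return {key: strip_tabs(item, replacement) for key, item in value.items()}
    return value  # numbers, booleans, and null pass through unchanged


records = json.loads(exported)      # parse the JSON text into Python objects
cleaned = strip_tabs(records)       # remove tabs from every text field
cleaned_json = json.dumps(cleaned)  # serialize again, ready to send back

print(cleaned[0]["title"])  # prints: Mushroom study
```

Replacing each tab with a single space is a choice made for this example; depending on the field, deleting the tab outright may be more appropriate.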

There’s a popular saying that “all roads lead to Rome.” Unsurprisingly, there are many other phrases that express the same idea, such as “there are many ways to crack an egg” or “there are many ways to bake a cake.” Just as there are many roads that lead to Rome and many ways to express that sentiment, there are many ways to interface with data. You just need to find the one that works for you.