RSS

API Transparency News

These are the news items I've curated in my monitoring of the API space that have some relevance to the API definition conversation and I wanted to include in my research. I'm using all of these links to better understand how the space is testing their APIs, going beyond just monitoring and understand the details of each request and response.

Cautiously Aware Of How APIs Can Be Used To Feed Government Privatization Machinations

I’m working to define APIs at the Department of Veterans Affairs (VA) on several fronts right now. I’ve provided not just one, but two detailed reponses to their Lighthouse API platform RFIs. Additionally, I’m providing guidance to different teams who are working on a variety of projects occurring simultaneously to serve veterans who receive support from the agency.

As I do this work, I am hyper aware of the privatization machinations of the current administration, and the potential for APIs to serve these desires. APIs aren’t good, bad, nor are they neutral. APIs reflect the desires of their owners, and the good or bad they can inflict is determined by their consumers. Because APIs are so abstract, and are often buried within web and mobile applications, it can be difficult to see what they are truly doing on a day to day basis. This is why we have API management solutions in place to help paint a picture of activity in real time, but as we’ve seen with the Facebook / Cambridge Analytica, it requires the API provider to be paying attention, and to possess incentives that ensure they will actually care about paying attention, and responding to negative events.

By re-defining existing VA internal systems as APIs, we are opening up the possibility that digital resources can be extracted, and transferred externally. Of course, if these systems remain the sources of data, VA power and control can be retained, but it also opens up the possibilities for power to eventually be transferred externally–reversing the polarities of government power and putting it into private hands. Depending on your politics, this could be good, or it could be bad. Personally, I’m a fan of a balance being struck, allowing government to do what it does best, and allowing private, commercial entities to help fuel and participate in what is happening–both with proper oversight. This reflects the potential benefits APIs can bring, but this good isn’t present by default, and could shift one way or the other at any moment.

Early on, APIs appeared as a promising way to deliver value to external developers. They held the promise of delivering unprecedented resources to developers, thing they would not normally be able to get access to. Facebook promised democratization using it’s social platform. While silently mining and extracting value from it’s application end-users, executed by often unknowing, but sometimes complicit developers. A similar transfer of power has occurred with Google Maps, which promised unprecedented access to maps for developers, as well as web and mobile maps for cities, businesses, and other interests. Over time we’ve seen that Google Maps APIs have become very skilled at extracting value from cities, business, and end-users, again executed unknowingly, and knowingly by developers. This process has been playing out over and over throughout the last decade of the open data evolution occurring across government at all levels.

Facebook and Google have done this at the platform level, extracting value and maintaining control within their operations, but the open data movement, Facebook and Google included, have been efficiently extracting value from government, and giving as little back as it possibly can. Sure, one can argue that government benefits from the resulting applications, but this model doesn’t ensure that the data and content stewards within government retain ownership or control over data, and benefit from revenue generated from the digital resources–leaving them struggling with tight budgets, and rarely improving upon the resources. This situation is what I fear will happen at the Department of Veterans Affairs (VA), allowing privatization supporters to not actually develop applications that benefit veterans, but really focusing on extracting value possessed by the VA, capturing it, and generating revenue from it, without ever giving back to the VA–never being held accountable for actually delivering value to veterans.

I have no way of testing the political motivations of people that I am working with. I also don’t want to slow or stop the potential of APIs being used to help veterans, and ensure their health and insurance information can become more interoperable. I just want to be mindful of this reality, and revisit it often. Openly discussing it as I progress in my work, and as the VA Lighthouse platform evolves. Sometimes I regret all the API evangelism I have done, as I witness the damage they can do. However, the cat is out of the bag, and I’d rather be involved, part of the conversation, and pushing for observability, transparency, and open dialogue about the realities of opening up government digital resources to private entities via APIs. Allowing people who truly care about serving veterans to help police what is going on, maximize the value and control over digital resources into the hands of people who are truly serving veteran’s interests, and minimizing access by those who are just looking to make money at veteran’s expense.


Government Has Benefitted From Lack Of Oversight At The Social API Management Layers As Well

There are many actors who have benefitted from Facebook not properly management their API platform. Collecting so many data points, tracking so many users, and looking the other way as 3rd party developers put them to use in a variety of applications. Facebook did what is expected of them, and focused on generating advertising revenue from allowing their customers to target the many data points that are generated by the web and mobile applications developed on top of their platform.

I hear a lot of complaining about this being just about democrats being upset that it was the republicans who were doing this, when in reality there are so many camps that are complicit in this game, if you are focusing in on any single camp you are probably looking to shift attention and blame from your own camp, in my opinion. I’m all for calling everyone out, even ourselves for being so fucking complicit in this game where we allow ourselves to be turned into a surveillance marketplace. Something that benefits the Mark Zuckerbergs, Cambridge Analyticas, and the surveillance capitalist platform, but also providing a pipeline to government, and law enforcement.

If government is going to jump on the bandwagon calling for change at Facebook, they need to acknowledge that they’ve been benefitting from the data free for all that is the Facebook API. We’ve heard stories about 3rd party developers building applications for law enforcement to target Black Lives Matters, all the way up to Peter Thiel’s (a Facebook Investor) Palantir who actively works with NSA, and other law enforcement. If we are going to shine a light on what is happening on Facebook, let’s make it a spotlight and call out ALL of the actors who have been sucking user’s data out of the API, and better understand what they have been doing. I’m all for a complete audit of how the RNC, DNC, and EVERY other election, policing, and government entity has been doing with social data they have access to.

Let’s bring all of the activity out of the shadows. Let’s take EVERY Facebook application and publish as part of a public database, with an API for making sense of who is behind each one. Let’s require every application share the name’s of individuals and business interests behind their applications, as well as their data retention, storage, and sharing practices. Let’s make the application database observable by the public, journalists, and researchers. If we are going to understand what is going on with our data, we need to turn the lights on in the application warehouse that is built on the Facebook API. Let’s not just point the finger at one group, let’s shine a spotlight on EVERYONE, and make sure we are actually being honest at this point in time, or we are going to see this same scenario play out every fiver years or so.


Facebook, Cambridge Analytica, And Knowing What API Consumers Are Doing With Our Data

I’m processing the recent announcement by Facebook to shut off the access of Cambridge Analytica to it’s valuable social data. The story emphasizes the importance of real time awareness and response to API consumers at the API management level, as well as the difficulty in ensuring that API consumers are doing what they should be with the data and content being made available via APIs. Access to platforms using APIs is more art than science, but there are some proven ways to help mitigate serious abuses, and identify the bad actors early on, and prevent their operation within the community.

While I applaud Facebook’s response, I’m guessing they could have taken more action earlier on. Their response is more about damage control to their reputation, after the fact, than it is about preventing the problem from happening. Facebook most likely had plenty of warning signs regarding what Aleksandr Kogan, Strategic Communication Laboratories (SCL), including their political data analytics firm, Cambridge Analytica, were up to. If they weren’t than that is a problem in itself, and Facebook should be investing in more policing of their API consumers activity, as they claim they are doing in their release.

If Aleksandr Kogan has that many OAuth tokens for Facebook users, then Facebook should be up in his business, better understanding what he is doing, where his money comes from, and who is partners are. I’m guessing Facebook probably had more knowledge, but because it drove traffic, generated ad revenue, and was in alignment with their business model, it wasn’t a problem. They were willing to look the other way with the data sharing that was occurring, until it became a wider problem for the election, our democracy, and in the press. Facebook should have more awareness, oversight, and enforcement at the API management layer of their platform.

This situation I think highlights another problem of doing APIs, and ensuring API consumers are behaving appropriately with the data, content, and algorithms they are accessing. It can be tough to police what a developer does with data once they’ve pulled from an API. Where they store it, and who they share it with. You just can’t trust that all developers will have the platform, and it’s end-users best interest in mind. Once the data has left the nest, you really don’t have much control over what happens with it. There are ways you can identify unhealthy patterns of consumption via the API management layer, but Aleksandr Kogan’s quizzes probably would appear as a normal application pattern, with no clear signs of the relationships, and data sharing going on behind the scenes.

While I sympathize with Facebook’s struggle to police what people do with their data, I also know they haven’t invested in API management as much as they should have, and they are more than willing to overlook bad behavior when it supports their bottom line. The culture of the tech space supports and incentivizes this type of bad behavior from platforms, as well as consumers like Cambridge Analytica. This is something that regulations like GDPR out of the EU is looking to correct, but the culture in the United States is all about exploitation at this level, that is until it becomes front page news, then of course you act concerned, and begin acting accordingly. The app, big data, and API economy runs on the generating, consuming, buying, and selling of people’s data, and this type of practice isn’t going to go away anytime soon.

As Facebook states, they are taking measures to reign in bad actors in their developer community by being more strict in their application review process. I agree, a healthy application review process is an important aspect of API management. However, this does not address the regular review of applications usage at the API management level, assessing their consumption as they accumulate access tokens, to more user’s data, and go viral. I’d like to have more visibility into how Facebook will be regularly reviewing, assessing, and auditing applications. I’d even go so far as requiring more observability into ALL applications that are using the Facebook API, providing a community directory that will encourage transparency around what people are building. I know that sounds crazy from a platform perspective, but it isn’t, and would actually force Facebook to know their customers.

If platforms truly want to address this problem they will embrace more observability around what is happening in their API communities. They would allow certified and verified researchers and auditors to get at application level consumption data available at the API management layer. I’m sorry y’all, self-regulation isn’t going to cut it here. We need independent 3rd party access at the platform API level to better understand what is happening, otherwise we’ll only see platform action after problems occur, and when major news stories are published. This is the beauty / ugliness of APIs. The cats out of the bag, and platforms need them to innovate, deliver resources to web, mobile, and device applications, as well as remain competitive. APIs also provide the opportunity to peek behind the curtain, and better understand what is happening, and profile the good and the bad actors within each ecosystem–let’s take advantage of the good here, to help regulate the bad.


Facebook Cambridge Analytica And Knowing What Api Consumers Are Doing With Our Data

404: Not Found


Insecurity Around Providing Algorithmic Transparency And Observability Using APIs

I’m working on a ranking API for my partner Streamdata.io to help quantify the efficiencies they bring to the table when you proxy an existing JSON web API using their service. I’m evolving an algorithm they have been using for a while, wrapping it in a new API, and applying it across the APIs I’m profiling as part of my API Stack, and the Streamdata.io API Gallery work. I can pass the ranking API any OpenAPI definition, and it will poll and stream the API for 24 hours, and return a set of scores regarding how real time the API is, and what the efficiency gains are when you use Streamdata.io as a proxy for the API.

As I do this work, I find myself thinking more deeply about the role that APIs can play in helping make algorithms more transparent, observable, and accountable. My API ranking algorithm is pretty crude, but honestly it isn’t much different than many other algorithms I’ve seen companies defend as intellectual property and their secret sauce. Streamdata.io is invested in the ranking algorithm and API being as transparent as possible, so that isn’t a problem here, but each step of the process allows me to think through how I can continue to evangelize other algorithm owners to use APIs, to make their algorithms more observable and accountable.

In my experience, most of the concerns around keeping algorithms secret stem from individual insecurities, and nothing actually technical, mathematical, or proprietary. The reasons for the insecurities are usually that the algorithm isn’t that mathematically sophisticated (I know mine isn’t), or maybe it is pretty flawed (I know mine is currently), and people just aren’t equipped to admit this (I know I am). I’ve worked for companies who venomously defend their algorithms and refuse to open them up, because in the end they know they aren’t defensible on many levels. The only value the algorithm possesses in these scenarios is secrecy, and the perception that there is magic going on behind the scenes. When in reality, it is a flawed, crude, simple algorithm that could actually be improved upon if it was opened up.

I’m not insecure about my lack of mathematical skills, or the limitations of my algorithm. I want people to point out its flaws, and improve upon my math. I want the limitations of the algorithm to be point out. I want API providers and consumers to use the algorithm via the API (when I publish) to validate, or challenge the algorithmic assumptions being put forth. I’m not in the business of selling smoke and mirrors, or voodoo algorithmics. I’m in the business of helping people understand how inefficient their API responses are, and how they can possibly improve upon them. I’m looking to develop my own understanding of how can make APIs more event-driven, real time, and responsive. I’m not insecure about providing transparency and observability around the algorithms I develop, using APIs–all algorithm developers should be as open and confident in their own work.


More Outputs Are Better When It Comes To Establishing An API Observability Ranking

I’ve been evolving an observability ranking for the APIs I track on for a couple years now. I’ve bene using the phrase to describe my API profiling and measurement approach since I first learned about the concept from Stripe. There are many perspectives floating around the space about what observability means in the context of technology, however mine is focused completely on APIs, and is more about communicating with external stakeholders, more than it is just about monitoring of systems. To recap, the Wikipedia definition for observability is:

Formally, a system is said to be observable if, for any possible sequence of state and control vectors, the current state can be determined in finite time using only the outputs (this definition is slanted towards the state space representation). Less formally, this means that from the system’s outputs it is possible to determine the behavior of the entire system. If a system is not observable, this means the current values of some of its states cannot be determined through output sensors.

Most of the conversations occurring in the tech sector are focused on monitoring operations, and while this is a component of my definition, I lean more heavily on the observing occurring beyond just internal groups, and observability being about helping keep partners, consumers, regulators, and other stakeholders be more aware regarding how complex systems work, or do not work. I feel that observability is critical to the future of algorithms, and making sense of how technology is impacting our world, and APIs will play a critical role in ensuring that the platforms have the external outputs required for delivering meaningful observability.

When it comes to quantifying the observability of platforms and algorithms, the more outputs available the better. Everything should have APIs for determining the inputs and outputs of any algorithm, or other system, but there should also be operational level APIs that give insight into the underlying compute, storage, logging, DNS, and other layers of delivering technological solutions. There should also be higher level business layer APIs surrounding communication via blog RSS, Twitter feeds, and support channels like email, ticketing, and other systems. The more outputs around platform operations there are, the more we can measure, quantify, and determine how observable a platform is using the outputs that already exist for ALL the systems in use across operations.

To truly measure the observability of a platform I need to be able to measure the technology, business, and politics surrounding its operation. If communication and support exist in the shadows, a platform is not observable, even if there are direct platform APIs. If you can’t get at the operational layer like logging, or possibly Github repositories used as part of continuous integration or deployment pipelines, observability is diminished. Of course, not all of these outputs should all be publicly available by default, but in many cases, there really is no reason they can’t. At a minimum there should be API access, with some sort of API management layer in place, allowing for 3rd party auditors, and analysts like me to get at some, or all of the existing outputs, allowing us to weigh in on an overall platform observability workflow.

As I continue to develop my API observability ranking algorithm, the first set of values I calculate are the number of existing outputs an API has. Taking into consideration the scope of core and operational APIs, but also whether I can get at operations via Twitter, Github, LinkedIn, and other 3rd party APIs. I find many of these channels more valuable for understanding platform operations, than the direct APIs themselves. Chatter by employees, and commits via Github can provide more telling signals about what is happening, than the intentional signals emitted directly by the platform itself. Overall, the more outputs available the better. API observability is all about leveraging existing outputs, and when companies, organizations, institutions, and government agencies are using solutions that have existing APIs, they are more observable by default, which can have a pretty significant impact in helping us understand the impact a technological solution is having on everyone involved.


Hiding APIs In Plain Sight

I’m always surprised by how secretive folks are. I know that it is hard for many folks to be as transparent as I am with my work, but if you are doing public APIs, I have a basic level of expectation that you are going to be willing to talk and share stories publicly. I regularly have conversations with enterprise folks who are unwilling to talk about what they are doing on the record, or allow me to share stories about their PUBLIC API EFFORTS!!! I get the super secret internal stuff. I’ve worked for the government. I don’t have a problem keeping things private when they should be, but the secretive nature of companies around public API efforts continues to keep me shaking my head.

People within enterprise groups almost seem paranoid when it comes to people keeping an eye on what they are up to. I don’t doubt their competitors keep an eye on what they are doing, but thinking that people are watching every move, everything that is published, and will be able to understand what is going on, and be able to connect the dots is borderline schizophrenic. I publish almost everything I do public by default on Github repositories, and my readers, clients, and other folks still have trouble finding what I am doing. You can Google API Evangelist + any API topic and find what I’m working on each day, or you can use the Github search to look across my repositories, and I still have an inbox and social messaging full of requests for information.

My public by default stance has done amazing things for my search engine and social media presence. I don’t even have to optimize things, and I come up for almost every search. I get regular waves of connections from folks on topics ranging from student art to the government, because of my work. The schema, API definitions, documentation, tests, and stories for any API I am working on is public from day one of the life cycle. Until I bring it together into a single landing page, or public URL, the chances someone will stumble across it, and leverage my work before I’m ready for it to be public is next to zero. The upside is that when I’m ready to go public with a project, by the time I hit publish, and make available via the primary channels for my network, things are well indexed, and easily found via the usual online channels.

I feel like much of the competitive edge that enterprise groups enjoy is simply because they are secretive. There really isn’t a thing there. No secret sauce. Just secret. I find that secret tends to hide nothing, or at least is hiding incompetency, or shady behavior. I was talking with a big data group the other day that was looking for a contractor that was skilled in their specific approach to doing APIs. I asked for a link to their public APIs, so I could assess this specific approach, and they declined, stating that things private. Ok, so you want me to help find someone who knows about your API data thing, is well versed in your API data thing, but I can’t find this API data thing publicly, and neither can anyone else? I’m sorry, this just doesn’t make sense. How can your API data thing ever really be a thing, if nobody knows about it? It just all seems silly to me.

Your API, it’s documentation, and other resources can be public, without the access to your API being public. Even if someone can mimic your interface, they still don’t have all your data, content, and algorithmic solutions. You can vet every consumer of your API, and monitor what they are doing, this is API management 101. You can still protect your valuable digital assets while making them available for discovery, and consideration by potential consumers. You can make controlled sandbox environments available for talent acquisition, building of prototypes, crafting visualizations, doing analysis, and other ways businesses benefit from doing APIs. My advice to companies, institutions, organizations, and government agencies looking to be successful with APIs is stop being so secretive, and start hiding everything you are doing with your public APIs out in plain sight.


API Providers Should Provide Observability Into Government Developer Accounts

I’ve talked about this before, but after reading several articles recently about various federal government agencies collecting, and using social media accounts for surveillance lately, it is a drum I will be beating a lot more regularly. Along with the transparency reports we are beginning to see emerge from the largest platform providers, I’d like to start seeing more observability regarding which accounts, both user and developer are out of government agencies. Some platforms are good at highlighting how government of all shapes and sizes are using their platform, and some government agencies are good at showcasing their social media usage, but I’d like to understand this from purely an API developer account perspective.

I’d like to see more observability into which government agencies are requesting API keys. Maybe not specific agencies ad groups, and account details, although that would be a good idea as well down the road. I am just looking for some breakdown of how many developer accounts on a platform are government and law enforcement. What does their API consumption look like? If there is Oauth via a platform, is there any bypassing of the usual authentication flows to get at data, any differently than regular developers would be accessing, or requiring user approval? From what I am hearing, I’m guessing that there are more government accounts out there than platforms either realize, or are willing to admit. It seems like now is a good time to start asking these questions.

I would add on another layer to this. If an application developer is developing applications on behalf of law enforcement, or as part of a project for a government agency, there should be some sort of disclosure at this level as well. I know I’m asking a lot, and a number of people will call me crazy, but with everything going on these days, I’m feeling like we need a little more disclosure regarding how government(s) are using our platforms, as well as their contractors. The transparency disclosure that platforms have been engaging is a good start to the legal side of this conversation, but I’m looking for the darker, more lower level surveillance that I know is going on behind the scenes. The data gathering on U.S. citizens that doesn’t necessarily violate any particular law, because this is such new territory, and the platform terms of service might sanction it in some loopholy kind of way.

This isn’t just a U.S. government type of thing. I want this to be standard practice for all forms of government on the platforms we use. A sort of UN level, General Data Protection Regulation (GDPR). Which reminds me. I am woefully behind on what GDPR outlines, and how the rolling out of it is going. Ok, I’ll quick ranting now, and get back to work. Overall, we are going to have to open up some serious observability into how the online platforms we are depending are being accessed and use by the government, both on the legal side of things, as well as just general usage. Seems like the default on the general usage should always be full disclosure, but I’m guessing it isn’t a conversation anyone is having yet, which is why I bring up. Now we are having it. Thanks.


I Do Not Fear AI, I Fear The People Doing AI

There is a lot of FUD out there when it comes to artificial intelligence (AI) and machine learning (ML). The tech press enjoy yanking people’s chain when it comes to the dangers of artificial intelligence. AI is coming for your jobs. AI is racist, sexist, and biased. AI will be lead to World War III. AI will secure and protect us from the bad out there. AI will be the source of all of our worries, and the solution to all of our worries. I’m interested in the storytelling around all of this, and I’m fascinated by the distracting quality of technology when it comes to absolving the humans behind of doing bad things.

We have the technology to make this black boxes more observability and accountable. The algorithms feeding us news, judging us in courtrooms, and deciding if we are insurable or a risk, can all be wrapped with APIs, and made more accountable. However, there are many human reasons why we don’t do this. Every AI out there can be held accountable, it isn’t rocket science. The technology exists to keep AI from hurting us, judging us, and impacting our lives in negative ways. However, it is the people behind who do not want it, otherwise their stories won’t work. Their stories won’t have the desired effect and control over our lives.

APIs are the layer being wielded for good and for bad on the Internet. Facebook, Twitter, and Reddit, all leverage APIs to be available on our mobile phones. APIs are how people automate, advertise, and fund their activities on their platforms. APIs are how AI and ML are being exposed, wielded, and leveraged. The technology is already there to make them more accountable, we just don’t have the human will to use the technology we have. There is more money to be made in telling wild stories about what is possible. Telling stories that make folks afraid, and in awe of what is possible with technology. APIs are used to tell you the stories, while also making the fire shoot from the stage, and the smoke and the mirrors operate, instead of helping us see, understand, and verify what is going on behind the scenes.

We rarely discuss the fact that AI isn’t coming for our jobs. It is the people behind the AI, at the companies developing, deploying, and operating AI that are coming for our jobs. AI, like APIs, are neither good, nor bad, nor neutral–they are a tool. They are technology, and anything they do is because of us humans. I don’t fear AI. I only fear the people doing AI. The people who tell the stories. The people who are believers. I don’t fear technology because I know we have the tools to do what is right, and hold the people who are using technology in bad ways accountable. I’m afraid because we don’t seem to have the will to look behind the curtain. We hold up many of the people telling stories about AI as visionaries, leaders, and truth tellers. I don’t fear AI, I only fear its followers.


Your Internal Dysfunction Is Not My API Problem

You hear a lot of discussion regarding public API vs private API. From my vantage point there is only web APIs that use public DNS, but I find that folks hung up on the separation usually have many other hangups about things they like to keep behind the firewall, and under the umbrella of private. These are usually the same folks who like to tell me that my public API stories don’t apply to them, and when you engage these folks in any ongoing fashion you tend to find that they are looking to keep a whole lot of dysfunction out of view from the public, and all the talk really has very little to do with APIs.

I spend my days studying the best practices across the leading API providers, and understanding what is working and what is not working when it comes to operating APIs. I have seven years of research I’m happy to share with folks, and I entertain a number requests to jump on calls, participate in webinars, do hangouts, and go onsite to do workshops and talks. I’m usually happy to do these things, and when it is a government agency, non-profit organization, and sometimes higher educational institutions, I am happy to these things at no charge. I like sharing what I know, and letting folks decide what they can use from the knowledge I’ve aggregated.

When I engage with folks I expect folks to not always be trusted–they don’t know me. However, I’m always surprised when folks think I have an agenda, looking to change them too fast, that I’m trying to shove something down their throat, and disrupt their position. First, I am always being invited in. I’m not a sales guy. I do not have anything to sell you except for my knowledge (which revenue goes to just goes back into doing what I do). There is regularly the quiet IT person who has carefully defended their position in the company and what they do not know for years, telling me this is all horse shit, and who do I think I am. Or the administrator who is always running around with their head cut-off, and feels the need to tell me that I do not understand them, and there is no way that any knowledge that I have is at all applicable to them–what the hell am I even doing here?

Hey, I’m just looking to share. If you don’t want it, I’m happy to tell me stories elsewhere, you don’t have to jump my shit. I’m genuinely trying to help, and share stories about what is working in other situations. I’m not looking to take your job, or make you do things you don’t have time or capacity for. Your dysfunction is not my API problem. I’m guessing this dysfunction is probably why you don’t have any sort of public API presence, and why your existing internal and partner APIs are not everything you’d like them to be. I am just going to quietly pick up my things and go back to what I was doing, I’m sorry that I’ve disrupted your chaotic world, and took up your time. I most likely won’t be back, but if you ever need to learn anything about APIs, and need a break, you know where to find me.

Note: If my writing is a little dark this week, here is a little explainer–don’t worry, things will back to normal at API Evangelist soon.


My Focus On Public APIs Also Applies Internally

A regular thing I hear from folks when we are having conversations about the API lifecycle, is that I focus on public APIs, and they are more interested in private APIs. Each time I hear this I try to take time and assess which parts of my public API research wouldn’t apply to internal APIs. You wouldn’t publish your APIs to pubic API search engines like APIs.io or ProgrammableWeb, and maybe not evangelizing your APIs at hackathons, but I’d say 90% of what I study is applicable to internal APIs, as well as publicly available APIs.

With internal APIs, or private network partner APIs you still need a portal, documentation, SDKs, support mechanisms, and communication and feedback loops. Sure, how you use the common building blocks of API operations that I track on will vary between private and public APIs, but this shifts from industry to industry, and API to API as well–it isn’t just a public vs. private thing. I would say that 75% of my API industry research is derived from public API operations–it is just easier to access, and honestly more interesting to cover than private ones. The other 25% of internal API conversation I’m having, always benefit from thinking through the common API building blocks of public APIs, looking for ways they can be more successful with internal and partner APIs.

I’d say that a significant part of the mindshare for the microservices philosophy is internally focused. I think this is something that will come back to hurt some implementations, cutting out many of the mechanisms and common elements required in a successful API formula. Things like portals, documentations, SDKs, testing, monitoring, discovery, support, communications all contribute to an API working, or not working. I’ve said it before, and I’ll say it again. I’m not convinced that there is any true separation in public vs private APIs, and there remains to be a great deal we can learn from public API providers, and put to work across internal operations, and with our most trusted partners.


Observability In Botnet Takedown By Government On Private Infrastructure

I'm looking into how to make [API security](http://security.apievangelist.com) more transparent and observable lately, and looking for examples of companies, institutions, organizations, politicians, and the government are calling for observability into wherever APIs are impacting our world. Today's example comes out of [POLITICO's Morning Cybersecurity email newsletter](http://www.politico.com/tipsheets/morning-cybersecurity), which has become an amazing source of daily information for me, regarding transparency around the take down of bot networks. _"If private companies cooperate with government agencies - for example, in the takedown of botnets using the companies' infrastructure - they should do so as publicly as possible, [argued the Center for Democracy & Technology]( https://www.ntia.doc.gov/files/ntia/publications/cdt-ntia-nistcommentsbotnetsfinal.pdf) . "One upside to compulsory powers is that they presumptively become public eventually, and are usually overseen by judges or the legislative branch," CDT argued in its filing. "Voluntary efforts run the risk of operating in the dark and obscuring a level of coordination that would be offensive to the general public. It is imperative that private actors do not evolve into state actors without all the attendant oversight and accountability that comes with the latter."_ I've been tracking on the transparency statements and initiatives of all the API platforms. At some point I'm going to assemble the common building blocks of what is needed for executing platform transparency, and I will be including these asks of the federal government. As the Center for Democracy & Technology states this relationship between the public and private sector when it comes to platform surveillance needs to be more transparent and observable in all forms. Bots, IoT, and the negative impacts of API automation needs to be included in the transparency disclosure stack. If the government is working with platform to discover, surveil, or shutdown bot networks there should be some point in which operations should be shared, including the details of what was done. We need platform [transparency](http://transparency.apievangelist.com) and [observability](http://observability.apievangelist.com) at the public and private sector layer of engagement. Sure, this sharing of information would be time sensitive, respecting any investigations and laws, but if private sector infrastructure is being used to surveil and shut down U.S. citizens there should be an accessible, audit-able log for this activity. Of course it should also have an API allowing auditors and researchers to get all relevant information. Bots are just one layer of the API security research I'm doing, and the overlap in the bot world when it comes to API transparency, observability, and security is an increasingly significant vector when it comes to policing, [surveillance](http://surveillance.apievangelist.com), but also when it comes to protecting the privacy and safety of platform people (citizens).


Learning About Real-Time Advertising Bidding Transparency Using Ads.txt

I was learning about real-time bidding transparency using Ads.txt from Lukasz Olejnik. The mission of the ads.txt project is to “increase transparency in the programmatic advertising ecosystem. Ads.txt stands for Authorized Digital Sellers and is a simple, flexible and secure method that publishers and distributors can use to publicly declare the companies they authorize to sell their digital inventory.” While Ads.txt isn’t an API, it is an open, machine readable definition that is working to make advertising more transparent and observable to everyone, not just people in the ad-tech space.

Ads.txt works similar to robots.txt, and is a simple text file that lives in the root of a domain, listing the companies that have permission to sell advertising. The format is new, so there isn’t a lot of adoption yet, but you can see one in action over at the Washington Post. Helping make platforms observable is why I perform as the API Evangelist. I see them as one of the important tools we have for making systems, algorithms, and platforms more observable, and less of a black box. I see ads.txt having similar potential for the world of advertising, and something that eventually could have an API, for helping make sense of the very distributed, brokered, and often dark world of online advertising.

Honestly, I know almost nothing about online advertising. I have a basic level of understanding of Google Analytics, Adwords, and Adsense, as well as reading the blogs, and documentation for many advertising APIs I come across in my regular monitoring of the API space. I am just interested in ads.txt as an agent of observability, and pulling back the current on who is buying and selling our digital bits online. I am adding ads.txt to my API definitions research. This will allow me to keep an eye on the project, see where it goes, and think about the API level for aggregation of the ad network data, on maybe Github or something–caching ads.txt on a daily basis. Who knows where it will go. I’ll just keep learning, and playing around until it comes into a little more focus.

I’m guessing that not all companies will want to be transparent like this. Folks who have the upper hand, or are just doing shady things, rarely like to have the curtain pulled back. I’ll keep an eye ads.txt adoption, and also spend some time thinking thinking about transparency and observability at the API monetization and plans level, going beyond just advertising, which is the staple of the web, and think about how we can keep things open when it comes to API business models.


Diagramming The Components Of API Observability

I created a diagram of the politics of APIs sometime ago that has really held true for me, and is something I’ve continue to reference as part of my storytelling. I wanted to do a similar thing to help me evolve my notion of API observability. Like the politics of APIs, observability overlaps many areas of my API life cycle research. Also like the politics of APIs, observability involves many technical, business, and legal aspects of operating a platform online today.

Here is my first draft of a Venn diagram beginning to articulate what I see as the components of API observability:

The majority of the API observability conversation in the API space currently centers around logging, monitoring, and performance–driven by internal motivations, but done in a way that is very public. I’m looking to push forward the notion of API observability to transcend the technical, and address the other operational, industry, and even regulatory concerns that will help bring observability to everyone’s attention.

I do not think we should always be doing API, AI, ML and the other tech buzzwords out there if we do not have to–saying no to technology can be done. In the other cases where the answer is yes, we should be doing API, AI, and ML in an observable way. This is my core philosophy. The data, content, algorithms, and networks we are exposing using APIs, and using across web, mobile, device, and network applications should be observable by internal groups, as well as partners, and public stakeholders as it makes sense. There will be industry, community, and regulatory benefits for sectors that see observability as a positive thing, and go beyond just the technical side of observability, and work to be more observable in all the areas I’ve highlight above.


Algorithmic Observability In Predictive Policing

As I study the world of APIs I am always on the lookout for good examples of APIs in action so that I can tell stories about them, and help influence the way folks do APIs. This is what I do each day. As part of this work, I am investing as much time as I can into better understanding how APIs can be used to help with algorithmic transparency, and helping us see into the black boxes that often are algorithms.

Algorithms are increasingly driving vital aspects of our world from what we see in our Facebook timelines, to whether or not we would commit a crime in the eyes of the legal system. I am reading about algorithms being used in policing in the Washington Monthly, and I learned about an important example of algorithmic transparency that I would like to highlight and learn more about. A classic argument regarding why algorithms should remain closed is centered around intellectual property and protecting the work that gives you your competitive advantage–if you share your secret algorithm, your competitors will just steal it. While discussing the predictive policing algorithm, Rebecca Wexler explores the competitive landscape:

But Perlin’s more transparent competitors appear to be doing just fine. TrueAllele’s main rival, a program called STRmix, which claims a 54 percent U.S. market share, has an official policy of providing defendants access to its source code, subject to a protective order. Its developer, John Buckleton, said that the key to his business success is not the code, but rather the training and support services the company provides for customers. “I’m committed to meaningful defense access,” he told me. He acknowledged the risk of leaks. “But we’re not going to reverse that policy because of it,” he said. “We’re just going to live with the consequences.”

And remember PredPol, the secretive developer of predictive policing software? HunchLab, one of PredPol’s key competitors, uses only open-source algorithms and code, reveals all of its input variables, and has shared models and training data with independent researchers. Jeremy Heffner, a HunchLab product manager, and data scientist explained why this makes business sense: only a tiny amount of the company’s time goes into its predictive model. The real value, he said, lies in gathering data and creating a secure, user-friendly interface.

In my experience, the folks who want to keep their algorithms closed are simply wanting to hide incompetence and shady things going on behind the scenes. If you listen to individual companies like Predpol, it is critical that algorithms stay closed, but if you look at the wider landscape you quickly realize this is not a requirement to stay competitive. There is no reason that all your algorithms can’t be wrapped with APIs, providing access to the inputs and outputs of all the active parts. Then using modern API management approaches these APIs can be opened up to researchers, law enforcement, government, journalists, and even end-users who are being impacted by algorithmic results, in a secure way.

I will be continuing to profile the algorithms being applied as part of predictive policing, and the digital legal system that surrounds it. As with other sectors where algorithms are being applied, and APIs are being put to work, I will work to find positive examples of algorithmic transparency like we are seeing from STRmix and HunchLab. I’d like to learn more about their approach to ensuring observability around these algorithms, and help showcase the benefits of transparency and observability of these types of algorithms that are impacting our worlds–helping make sure everyone knows that black box algorithms are a thing of the past, and the preferred approach of snake oil salesman.


Algorithmic Observability In Predictive Policing

As I study the world of APIs I am always on the lookout for good examples of APIs in action so that I can tell stories about them, and help influence the way folks do APIs. This is what I do each day. As part of this work, I am investing as much time as I can into better understanding how APIs can be used to help with algorithmic transparency, and helping us see into the black boxes that often are algorithms.

Algorithms are increasingly driving vital aspects of our world from what we see in our Facebook timelines, to whether or not we would commit a crime in the eyes of the legal system. I am reading about algorithms being used in policing in the Washington Monthly, and I learned about an important example of algorithmic transparency that I would like to highlight and learn more about. A classic argument regarding why algorithms should remain closed is centered around intellectual property and protecting the work that gives you your competitive advantage–if you share your secret algorithm, your competitors will just steal it. While discussing the predictive policing algorithm, Rebecca Wexler explores the competitive landscape:

But Perlin’s more transparent competitors appear to be doing just fine. TrueAllele’s main rival, a program called STRmix, which claims a 54 percent U.S. market share, has an official policy of providing defendants access to its source code, subject to a protective order. Its developer, John Buckleton, said that the key to his business success is not the code, but rather the training and support services the company provides for customers. “I’m committed to meaningful defense access,” he told me. He acknowledged the risk of leaks. “But we’re not going to reverse that policy because of it,” he said. “We’re just going to live with the consequences.”

And remember PredPol, the secretive developer of predictive policing software? HunchLab, one of PredPol’s key competitors, uses only open-source algorithms and code, reveals all of its input variables, and has shared models and training data with independent researchers. Jeremy Heffner, a HunchLab product manager, and data scientist explained why this makes business sense: only a tiny amount of the company’s time goes into its predictive model. The real value, he said, lies in gathering data and creating a secure, user-friendly interface.

In my experience, the folks who want to keep their algorithms closed are simply wanting to hide incompetence and shady things going on behind the scenes. If you listen to individual companies like Predpol, it is critical that algorithms stay closed, but if you look at the wider landscape you quickly realize this is not a requirement to stay competitive. There is no reason that all your algorithms can’t be wrapped with APIs, providing access to the inputs and outputs of all the active parts. Then using modern API management approaches these APIs can be opened up to researchers, law enforcement, government, journalists, and even end-users who are being impacted by algorithmic results, in a secure way.

I will be continuing to profile the algorithms being applied as part of predictive policing, and the digital legal system that surrounds it. As with other sectors where algorithms are being applied, and APIs are being put to work, I will work to find positive examples of algorithmic transparency like we are seeing from STRmix and HunchLab. I’d like to learn more about their approach to ensuring observability around these algorithms, and help showcase the benefits of transparency and observability of these types of algorithms that are impacting our worlds–helping make sure everyone knows that black box algorithms are a thing of the past, and the preferred approach of snake oil salesman.


Budget Transparency At The County Level With Open Data And APIs

As we find ourselves darker times when it comes to transparency within the federal government in the United States, I’m always on the hunt for any positive signs of transparency at the other levels of government. I can usually depend on my friends over at Socrata to help out, and they came through this week with a story on Douglas County, Kansas publishing their budget online.

“This week, Douglas County, Kansas, home to 117,000 residents, launched an Open Budget site, which provides the public with access to one of the county’s most crucial documents: the annual operating budget.” Jill Jolicoeur, Assistant to the County Administrator stated that “our goal is for Open Budget to replace the time-intensive process of developing and publishing the annual budget in a hard-copy format.” Open data and APIs is one way for resource-strapped companies can open things up, allowing external groups help share the load when it comes to auditing, analysis, visualization, and many other areas county government could use some assistance.

Douglas County provides us with another example of how we can make government more observable. There is no reason that county government can’t open up their budgets like this, and let a little sunlight into the operations. In my experience, the primary reason you want to keep things closed is when you are looking to hide corruption, incompetence, and poor technological and business decisions. I’m concerned with the bar being set by the White House right now when it comes to transparency and observability, but it doesn’t mean we can’t resist at the local level, and leverage open data and APIs to get things done more efficiently–like Douglas County is doing, and Socrata is enabling.


API Wrappers To Help Bring Machine Learning Into Focus

I was taking a look at the Tensorflow Object Detection API, and while I am interested in the object detection, the usage of API is something I find more intriguing. It is yet another example of how diverse APIs can be. This is not a web API, but an API on top of a single dimension of the machine learning platform TensorFlow.

“The TensorFlow Object Detection API is an open source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models.” It is just a specialized code base helping abstract away the complexity of one aspect of using TensorFlow, specifically for detecting objects in images. You could actually wrap this API with another web API and run on any server or within a single container as a proper object recognition API.

For me, it demonstrates one possible way of wrapping a single or cross section of a machine learning implementation to abstract away the complexity and helping you train and deploy ML models in this particular area. This approach to deploying an API on top of ML shows that you can use to APIs to help simplify and abstract ML for developers. This can be done to help satisfy business, regulatory, privacy, security, and other real or perceived concerns when it comes to artificial intelligence, machine learning, or any other digital voodoo that resembles magic.

No matter how complex the inputs and outputs of an algorithm are, you can always craft an API wrapper, or series of API wrappers that help others make sense of those inputs, from a technical, business, or even political perspective. I just wanted to highlight this example of ML being wrapped with an API, even if it isn’t for all the same reasons that I would be doing it. It’s just part of a larger toolbox I’m looking to create to help me make the argument for more algorithmic transparency in the machine learning platforms we are developing.


Transparency Around Every Company Who Has Access To Our Social Data Via An API

I believe that APIs can bring some important transparency to the web, mobile, and device applications that seem to be invading our life. I hesitate using the word transparency because it has been weaponized by Wikileaks and others in the current cyber(in)secure landscape, but for the purposes of this story, it will work. APIs by default do not mean transparency, but when done in the right way they can pull back the curtain a little on what is going on when a company, organization, institution, or agency behind is truly committed to transparency.

I’ve long had a portion of my research dedicated to studying intentional transparency efforts by API providers, giving me a place to publish any organizations, links, and stories that I publish on the subject of API transparency. As part of my API research I was looking at some university API efforts the other day when I came across the Apply Magic Sauce API, a personalisation engine that accurately predicts psychological traits from digital footprints of human behavior, which had a pretty interesting section dedicated to the subject of transparency. Here is some background on their approach:

Our methods have been peer-reviewed and published in open access journals since 2013, and new services that sound similar to Apply Magic Sauce API (AMS) are springing up every day. As this technology becomes more accessible and its impact increases, we would like to ensure that citizens have clarity on who we do and do not work with. We are therefore committed to keeping an up to date list of every organisation that we have formally authorised to use AMS for commercial purposes. These clients are advised to follow our ethical guidelines and are bound by our terms and conditions regarding the need to obtain the informed consent of individuals about whom predictions are made. We encourage other providers of predictive technologies to honour the principles of privacy, transparency and relevance and publish a similar list of their own.

The Apply Magic Sauce API provides me with a solid example of an existing institution, “…committed to keeping an up to date list of every organization that we have formally authorised to use AMS for commercial purposes”, and encouraging their partners to “publish a similar list of their own”. What an important example to set when it comes to APIs, especially APIs that involve the amount of personally identifiable information (PII) that a social network possesses. It provides a positive model that ANY application who allows for the OAuth’ing of a user via their Twitter, Facebook, or another social network should be emulating.

Honestly, I’m still not 100% sure about what they are up to at the University of Cambridge with their Apply Magic Sauce API–I am still getting going with my dive into their operations. However, from the amount of work they’ve put into the ethics behind their API platform, as well as the high bar for transparency being set, I’m willing to give them the benefit of the doubt that they are up to good things. I’ll keep diving into their API, and monitor the activity in the community to make sure they are standing behind their pledge, and see whether or not their API partners are respecting the pledge as well.

I am hoping that the University of Cambridge provides me with a solid ethical example of not just how an API provider can behave and communicate around the transparency of their API consumption, but also how they can set the same expectations within their API community. If I had my way this would be the standard operating procedure for EVERY company, organization, institution, and a government agency that possess any PII for ANY citizen–100% transparency of EVERY single partner, customer, and application developer who has access to any personal data.


A Checklist For API Observability

I have had the Wikipedia page for Observability open in a browser tab for weeks now. I learned about the concept from Stripe a while back and is something that I am looking to help define APIs from an external vantage point. In this world of fake news and wild promises of artificial intelligence and machine learning, we need these black boxes to be as observable as they can--I am hoping that APIs can be one of the tools in this observability toolbox.

Stripe is approaching this concept from the inside, with a focus on stability and reliability of their API operations. I am focusing on this concept from the outside, to "measure how well internal states of a system can be inferred by knowledge of its external outputs". More of a regulatory, auditing, and journalistic way of thinking, but in the API way of doing things. Some of this is about understanding, but some of it is also about holding providers accountable for what they are peddling.

The other day I mapped out what API monitoring means to me, rebranding it as what I'd prefer to call API awareness. I went through the elements I use to understand what is going on with APIs across the sector, but this time I am thinking about them in terms of observability. Meaning, not what I'm doing to be aware of APIs, but what is expected from providers to meet (my) minimum viable definition of a platform being observable.

  • Discovery - Do APIs exist? Are they easily discoverable? Are they public? Can anyone put them to use?
     
    • Applications - Find new applications built on top of APIs.
    • People - People who are doing interesting things with APIs.
    • Organization - Any company, group, or organization working with APIs.
    • Services - API-centric services that might help solve a problem.
    • Tools - Open source tools that put APIs to work solving a problem.
  • Versions - What versions are currently in use, what new versions are available, but not used, and what future versions are planned and on the horizon.
  • Paths - What paths are available for all available APIs, and what are changes or additions to this stack of resources.
  • Schema - What schema are available as part of the request and response structure for APIs, and available as part of the underlying data model(s) being used. What are the changes?
  • SDKs - What SDKs are available for the APIs I'm monitoring. What is new. What are the changes made regarding programming and platform develop kits?

    • Repositories - What signals are available about an SDK regarding it's Github repository (ie. commits, issues, etc.)
    • Contributors - Who are the contributors.
    • Forks - The number of forks on an SDK.
  • Communication - What is the chatter going on around individual APIs, and across API communities. We need access to the latest messages from across a variety of channels.

    • Blog - The latest from each API blog.
    • Press - Any press released about APIs.
    • Twitter - The latest from Twitter regarding API providers.
    • Facebook - The latest Facebook posts from providers.
    • LinkedIn - The latest LinkedIn posts from providers.
  • Issues - What are the current issues with an API, either known by the provider or possibly also reported from within the community.
  • Change Log - What changes have occurred to an API, that might impact service or operations.
  • Road Map - What planned changes are in the road map for a platform, providing a view of what is coming down the road.
  • Monitoring - What are the general monitoring statistics for an API, outlining its overall availability.
  • Testing - What are the more detailed statistics from testing APIs, providing a more nuanced view of API availability.
  • Performance - What are the performance statistics providing a look at how performant an API is, and overall quality of service.
  • Authentication - What are all of the authentication approaches available and in-use. What updates are there regarding keys, scopes, and other maintenance elements of authentication.
  • Security - What are the security alerts, notifications, and other details that might impact the security of services, or the approach taken by a platform to making sure a platform is secure.
  • Terms of Service - What are the changes to the terms of service, or other events related to the legal operations of the platform.
  • Licensing - What licenses are in place for the API, its definitions, and any code and tooling put to use, and what are the changes to licensing.
  • Patents - Are there any patents in play that impact API operations, or possibly an entire industry or area of API operations.
  • Logging - What other logging data is available from API consumption, or other custom solutions providing other details of API operations.
  • Plans - What are the plans and pricing in existence, and what are the tiers of access--along with any changes to the plans and pricing in the near future.
  • Analysis - What tools and services are available for further monitoring, understanding, and deriving intelligence from individual APIs, as well as across collections of APIs.
  • Embeddables - What embeddable tooling are available for either working with individual APIs, or across collections of APIs, providing solutions that can be embedded within any website, or application.
  • Visualizations - What visualizations are available for making sense of any single API or collections of APIs, providing easy to understand, or perhaps more complex visualizations that bring meaning to the dashboard.
  • Integration - What integration platform as a service (iPaaS), continuous integration, and other orchestration solutions are available for helping to make sense of API operations within this world.

I do not think APIs being present automatically mean a company, it's data, content, or algorithms are observable. This list represents what I feel enables and empowers observability, and are the key areas where I think companies need to work on when it comes to ensuring their operations are more observable. I pulled this list of elements from my own toolbox for monitoring the API sector but is trimmed down specifically for platform operators who might be thinking about how they can make things more observable for everyone involved, internal or external. 

If you are like me and operating outside the corporate firewall, this list represents the areas you should be encouraging platform operators to invest in when it comes to pulling back the current a little bit more. Different folks will read this post and walk away with different takes on what observability might mean. Some will view it as a good thing, while others will see it is as API driven regulatory interference. Your take probably has more to do with your position in the conversation, than it does to the observability of anything. 

Now for some concrete examples. What is the observability of the Facebook news algorithm? What is the observability of the Twitter feed? What is the observability of our credit score? How observable is COMPAS as it evaluates its risk score for whether a criminal offender will reoffend? How observable is Chicago's food inspection algorithm? How observable is the RSA algorithm which drives cryptography? How observable is algorithmic trading across stock markets? These are just a handful of some relatable algorithms when in reality there are numerous micro impacts of algorithms felt each moment, of each day, as we navigate our worlds.

I do not think there are easy answers when it comes to algorithmic transparency. This is just my attempt to understand how APIs can be put to work making the algorithms that are increasingly governing our reality more observable. I spend a lot of time trying to make sense of the very obscure things like cloud and social. What are they? What can you do with them? Do they do what they promise? When do things change? I shared a version of this list for folks on the outside to develop API awareness. Now,, this is my recommendations for what companies, organizations, institutions, and government agencies can do to make the data, content, and algorithms they use more observable--not just for our benefit, for yours as well.


APIs Can Give An Honest View Of What A Company Does

One of the reasons I enjoy profiling APIs is that they give an honest view of what a company does, absent of all the marketing fluff, and the promises that I see from each wave of startups. If designed right, APIs can provide a very functional, distilled down representation of data, content, and algorithmic resources of any company. Some APIs can be very fluffy and verbose, but the good ones are simple, concise, and straight to the point.

As I'm profiling the APIs for the companies included in my API monitoring research, what API Science, Apica, API Metrics, BMC Software, DataDog, New Relic, and Runscope offer quickly become pretty clear. A simple list of valuable resources you can put to use when monitoring your APIs. Crafting an OpenAPI Spec allows me to define each of these companies APIs, and easily articulate what it is that they do--minus all the bullshit that often comes with the businesses side of all of this. 

I feel like the detail I include for each company in an APIs.json file provides a nice view of the intent behind an API, while the details I put into the OpenAPI Spec provide insight into whether or not a company actually has any value behind this intent. It can be frustrating to wade through the amount of information some providers feel they need to publish as API documentation, but it all becomes worth it once I have the distilled down OpenAPI Spec, giving an honest view of what each company does.


The Netflix Public API Was The Most Successful API Failure Ever

I have written about the continuing Netflix API story over the years, which despite shuttering their public API, is an API effort that just keeps giving to the public. It is an API story that better reflects the reality of APIs, not the bullshit version you often get from the tech sector, and in my opinion is the most successful public API failure of all times--definitely one of the API stories I'll catalog in my history of APIs.

Netflix's recent story about the engineering trade-offs and the Netflix API re-architecture was just the latest reminder of how transparent Netflix is with their API operations, even though they do not have a public API--shifting the traditional discussions around open vs. closed APIs. Netflix has been sharing the story of their API operations like they were available to the public, even going as far as open sourcing much of the technology behind their operations.

There are many layers to the definition of what is an open API. There are many incentives for why you would want to be transparent with your APIs. I don't expect that all companies have fully public APIs, it's just not in the DNA of every company, or possible in all industries. However, I do think that companies should be as transparent as possible, especially if they are a publicly traded company. Netflix's reasons are always evident in the last sentence of their posts, "we are hiring for several different roles", but that is another story.


 


Pushing For More Algorithmic Transparency Using APIs

I saw the potential for collaboration when it came to using web APIs back around 2004 and 2005. I was seeing innovative companies opening up their digital assets to the world using low-cost, efficient Internet technology like HTTP, opening things up for potentially interesting approaches to collaboration around the development of web and mobile applications on top of valuable digital resources. This approach has brought us valuable platforms like Amazon Web Services and SalesForce. 

Common API discussions tend to focus on providing APIs to an ecosystem of developers and encouraging the development of web and mobile applications, widgets, visualizations, and other integrations that benefit the platform. In the course of these operations, it is also customary to gather feedback from the community and work to evolve the APIs design, available resources, and even the underlying data model--extending collaboration to also be about the APIs, and underlying resources, in addition to just building things on top of the API.

This approach to designing, defining, and deploying APIs, and then also web and mobile applications on top of these APIs is nothing new, and is something that I have been tracking on for over the last six years. The transparency that can be injected into the evolution of data, content, and potentially the "algorithms behind" with APIs is significant, which is how it became such a big part of my professional mission, and fueling my drive to spread the "gospel" whenever and wherever I can. 

Ok, so how can APIs contribute to algorithmic transparency? To fully grasp where I am taking this, you need to understand that APIs can be used as an input and output for data, content, as well as algorithms. Let's use Twitter as an example. Using Twitter and the Twitter API I can read and write data about myself, or any user, using the /account and /users API endpoints--providing the content and data portion of what I am talking about.

When it comes to the algorithm portion, Twitter API has several methods, such as GET statuses/user_timelineGET statuses/home_timeline and GET search/tweets, which return a "timeline of Tweet data". In 2006 this timeline was just the latest Tweets from the users you follow, in sequential order. In 2016, you will get "content powered by a variety of signals". In short, the algorithm that drives the Twitter timeline is pretty complicated, with a number of things to consider:

  • Your home timeline displays a stream of Tweets from accounts you have chosen to follow on Twitter. 
  • You may see suggested content powered by a variety of signals. 
  • Tweets you are likely to care about most will show up first in your timeline. 
  • You may see a summary of the most interesting Tweets you received since your last visit
  • You may also see content such as promoted Tweets or Retweets in your timeline.
  • Additionally, when we identify a Tweet, an account to follow, or other content that's popular or relevant, we may add it to your timeline.

There are a number of considerations that would go into any one timeline response--this is Twitter's algorithm. While I technically have access to this algorithm via three separate API endpoints, there really isn't much algorithmic transparency present, beyond their overview in the support section. Most companies are going to claim this is their secret sauce and their intellectual property. That is fine, I don't have a problem with y'all being secretive about this, even though I will always push you to be more open, as well as leave the API layer out of your patents you use to pretect your algorithms.

Algorithmic transparency with APIs is not something that should be applied to all APIs in my opinion, but for regulated industries, and truly open API solutions, transparency can go a long way, and bring a number of benefits. All Twitter (and any other API provider) has to do is add parameters, and corresponding that open up the variables of the underlying algorithm for each endpoint. What goes into considerations about "what I care about", constitutes "interesting", and what makes things "popular or relevant"? Twitter will never do this, but other API providers can.

It is up to each API provider to decide how transparent they are going to be with their algorithms. The ideal solution when it comes to transparency is that the algorithm is documented and shared along with supporting code on Github, like Chicago, did for their food inspection algorithm. This opens up the algorithm, and the code behind for evaluation by 3rd parties, potentially improving upon it, as well as validating the logic behind--potentially opening up a conversation about the life of the algorithm.

There are a number of common reasons I have seen for companies and developers not opening up their algorithms:

  • It truly is secret sauce, and too much was invested to just share with the world.
  • It is crap, and the creator doesn't want anyone to know there is nothing behind.
  • There are malicious things going on behind the scenes that they do not want to be public.
  • Insecurities about coding abilities, security practices and logic applied to the algorithms.
  • Exist in competitive space with lots of bad actors, and may want to limit this behavior.
  • What is accomplished isn't really that defensible, and the only advantage is to keep hidden.

I have no problem making an argument for algorithmic transparency when it comes to regulated industries, like financial, healthcare, and education. I think it should be default in all civic, non-profit, and other similar scenarios where the whole stack should just be open sourced, and available on Github. You won't find me pushing back to hard on the startups unless I see some wild claims about the magic behind, or I see evidence of exploitation, then you will hear me rant about this some more.

Algorithmic transparency can help limit algorithmic exploitation and the other shady shit that is going on behind the scenes on a regular basis these days. I have added an algorithm section to my research, and as I see more talk about the magic of algorithms, and how these amazing creations are changing our world--I am going to be poking around a bit, and probably asking to see more algorithmic transparency when I think it makes sense.


The Opportunity For API-Driven Algorithmic Transparency At The Mobile Data Plan Level

API Evangelist is focused on helping push for sensible API-driven transparency wherever I can get it. When done in sensible ways an API can crack open the often black box that is the algorithm, giving us access and more control over our online experience.

One of the most significant algorithmic bottlenecks that govern our daily lives is our mobile data plans. All of our mobile phones are governed at the data plan level--this is where the telecom companies make their money, throttling the bits and bytes we depend on each day. 

The mobile data plan is a great place to discuss the algorithmic and data transparency that APIs can assist with, and one example of this in action is with the Google Mobile Data Plan API. Google wants more access at this level to improve the quality of experience for end-users when using mobile applications like Youtube, which can be severely impacted by data plan limits, while also significantly impacting your data plan consumption if not optimized.

There is so much opportunity for discussion between mobile network operator, API providers, developers, consumers at the data plan level. I know mobile network operators would rather keep this a black box, so they can maximize their revenue, but when you crack the network layer open with a publicly available API, there will be a number of new revenue opportunities.

Data plans affect all of us, every single day. We need more transparency into the algorithms that meter, limit, and charge us at the mobile data plan layer. We need the platforms we depend on each day to have more tools to optimize how applications consume (or do not consume) this extremely (seemingly) finite, and valuable resource (thanks, telcos!). 

I've added an algorithms area to my research to keep an eye on this topic, curate stories I find, and share my own thoughts when it comes to algorithmic transparency using APIs.


Upstream Transparency Is A Serious Concern But Downstream Not So Much

My ongoing stories around how APIs can help make business, educational, healthcare, government, and the other transactions that we are increasingly seeing move online, be more transparent--enjoy a regular amount of snickers, subtweets, and laughs from folks in the space. Whether it is about opening up access to our social media dtaa, all the way up to how our home or student loans are hammered out, to how international trade agreements are crafted--I feel APIs have a role to play when opening things up (yeah I know I'm biased). 

With that said, APIs are NOT the answer in all scenarios. I do not see them as the aboslute solution, but they do offer promise when it comes to giving individuals a voice in how social media, financial, healthcare, and other data and content is used, or not used. A lot of people who are part of the startup machine love to laugh at me, pointing out how uninformed I am about how things work. We have to keep things private, and in the hands of those who are in the know--this is how business works, and everything would come to a screeching halt if we forced everything to open up.

While there is a wide variety of motivations behind why people poke at me about my persepective, I am noticing one important thing. They disagree with me when I point out things that are downstream from them, but they agree with me when it is upstream from them. Meaning if I focus on federal agencies like the IRS, or possibly banking on Wall Street, and other regulatory and legal influences -- they are all about transparency. If it is downstream from them with their existing customers, or would be customers for their future startups --hell no, transparency will always hurt the things. Just leave it to the smart, cool kids to make the important decisions, the average person shouldn't worry there little head about it..

This is privilege. You are unable to see people below the position you enjoy in society. You have an overinflated sense of what you are capable of, and that you will always do right by everyone  You possess a attitude that all your customers, and users who will be using your platform and tooling should just be thankful that you are so smart, and built this amazing platform for everyone to use. Please do not bother us elite with concerns privacy, security, and well being. We know best! They have things under control. However those bankers building the banking platform we depend on, people runnign those higher educational institutions, or those nasty regulators who craft the platform we are required to report business operations -- they can't be trusted! #TRANSPARENCY

Ok. Done ranting. Let me close with saying that transparency does not equal good. Transparency can be very bad, and I am not prescribing it in all situations. I am just pushing back on those who push back on me when I ask for transparency in the platforms in which I depend on for my business. Please do not try and disarm me with absolutes. I'm just asking for transparency in the business and political dealings that directly impact me as an individual, and my business. While also suggesting that more tech folks work to understand that there is a downstream impact from their world, they might not see, because of the priveleged position you enjoy, so please thinking a little more deeply as you race to get what is your.


If you think there is a link I should have listed here feel free to tweet it at me, or submit as a Github issue. Even though I do this full time, I'm still a one person show, and I miss quite a bit, and depend on my network to help me know what is going on.