Skip to main content

Ralsina.Me — Roberto Alsina's website

How I Learned to Stop Worrying and Love JSON Schema

[TOC]

Intro

This post op­er­ates on a few shared as­sump­tion­s. So, we need to ex­plic­it­ly state them, or oth­er­wise you will read things that are more or less ra­tio­nal but they will ap­pear to be garbage.

  • APIs are good
  • Many APIs are web APIs
  • Many web APIs con­sume and pro­duce JSON
  • JSON is good
  • JSON is bet­ter if you know what will be in it

So, JSON Schema is a way to in­crease the num­ber of times in your life that JSON is bet­ter in that way, there­fore mak­ing you hap­pi­er.

So, let's do a quick in­tro on JSON Schema. You can al­ways read a much longer and sure­ly bet­ter one from which I stole most ex­am­ples at Un­der­stand­ing JSON Schema. lat­er (or right now, it's your time, la­dy, I am not the boss of you).

Schemas

So, a JSON Schema de­scribes your da­ta. Here is the sim­plest schema, that match­es any­thing:

{ }

Scary, uh? Here's a more re­stric­tive one:

{
  "type": "string"
}

That means "a thing, which is a string." So this is valid: "foo" and this isn't 42 Usually, on APIs you exchange JSON objects (dictionaries for you pythonistas), so this is more like you will see in real life:

{
  "type": "object",
  "properties": {
    "street_address": { "type": "string" },
    "city": { "type": "string" },
    "state": { "type": "string" }
  },
  "required": ["street_address", "city", "state"]
}

That means "it's an ob­jec­t", that has in­side it "street_ad­dress", "c­i­ty" and "s­tate", and they are all re­quired.

Let's sup­pose that's all we need to know about schemas. Again, be­fore you ac­tu­al­ly use them in anger you need to go and read Un­der­stand­ing JSON Schema. for now just as­sume there is a thing called a JSON Schema, and that it can be used to de­fine what your da­ta is sup­posed to look like, and that it's de­fined some­thing like we saw here, in JSON. Cool?

Using schemas

Of course schemas are use­less if you don't use them. You will use them as part of the "con­trac­t" your API prom­ises to ful­fil­l. So, now you need to val­i­date things against it. For that, in python, we can use json­schema

It's pret­ty sim­ple! Here is a "ful­l" ex­am­ple.

import jsonschema

schema = {
  "type": "object",
  "properties": {
    "street_address": {"type": "string"},
    "city": {"type": "string"},
    "state": {"type": "string"},
  },
  "required": ["street_address", "city", "state"]
}

jsonschema.validate({
    "street_address": "foo",
    "city": "bar",
    "state": "foobar"
}, schema)

If the da­ta does­n't val­i­date, jsonchema will raise an ex­cep­tion, like this:

>>> jsonschema.validate({
...     "street_address": "foo",
...     "city": "bar",
... }, schema)
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "jsonschema/validators.py", line 541, in validate
    cls(schema, *args, **kwargs).validate(instance)
  File "jsonschema/validators.py", line 130, in validate
    raise error
jsonschema.exceptions.ValidationError: 'state' is a required property

Failed validating 'required' in schema:
    {'properties': {'city': {'type': 'string'},
                    'state': {'type': 'string'},
                    'street_address': {'type': 'string'}},
     'required': ['street_address', 'city', 'state'],
     'type': 'object'}

On instance:
    {'city': 'bar', 'street_address': 'foo'}

Hey, that is a pret­ty nice de­scrip­tion of what is wrong with that da­ta. That is how you use a JSON schema. Now, where would you use it?

Getting value out of schemas

Schemas are use­less if not used. They are worth­less if you don't get val­ue out of us­ing them.

These are some ways they add val­ue to your code:

  • You can use them in your web app end­point, to val­i­date things.
  • You can use them in your client code, to val­i­date you are not send­ing garbage.
  • You can use a fuzzer to feed da­ta that is tech­ni­cal­ly valid to your end­point, and make sure things don't ex­plode in in­ter­est­ing ways.

But here is the most val­ue you can ex­tract of JSON schemas:

You can dis­cuss the con­tract be­tween com­po­nents in un­am­bigu­ous terms and en­force the con­tract once it's in place.

We are de­vs. We dis­cuss via branch­es, and com­ments in code re­view. JSON Schema turns a vague ar­gu­ment about doc­u­men­ta­tion in­to a fac­t-based dis­cus­sion of da­ta. And we are much, much bet­ter at do­ing the lat­ter than we are at do­ing the for­mer. Dis­cuss the con­tract­s.

Since the doc­u­ment de­scrib­ing (this part of) the con­tract is ac­tu­al­ly used as part of the API def­i­ni­tions in the code, that means the doc­u­ment can nev­er be left be­hind. Ev­ery change in the code that changes the con­tract is ob­vi­ous and re­quires an ex­plic­it rene­go­ti­a­tion. You can't break API by ac­ci­den­t, and you can't break API and hope no­body will no­tice. En­force the con­tract­s.

Fi­nal­ly, you can ver­sion the con­trac­t. Use that along with API ver­sion­ing and voilá, you know how to man­age change! Ver­sion your con­tract­s.

  • Dis­cuss your con­tracts
  • En­force your con­tracts
  • Ver­sion your con­tracts

So now you can stop wor­ry­ing and love JSON Schema as well.

Pedro Vagner / 2017-04-22 18:39:

Hi Roberto, thanks for the post. I have one question, it's possible nest schemas?

Roberto Alsina / 2017-04-22 22:18:

Yes, you can have an element in a schema that is a reference to another one, check this: https://spacetelescope.gith...

Chris Warrick / 2017-04-22 18:55:

You got back to blogging often? Great!

So, what do you think of MyPy?

Roberto Alsina / 2017-04-22 22:18:

I am looking forward to playing with it, it seems very interesting.


Contents © 2000-2024 Roberto Alsina