Waiting for dev...

Ruby, Vim, Unix...

A Walk With JWT and Security (I): Stand Up for JWT Revocation

There is some debate about whether JWT tokens should be revoked (for example, on signing a user out) or whether, on the other hand, doing so is nonsense that defeats the primary reason this technology exists.

JWT tokens are self-describing. They encapsulate the information that a client is trying to provide to an API, such as the id of a user trying to authenticate. For this reason, they are described as stateless. The server only needs to validate that the incoming token bears its own signature in order to trust it; it doesn’t have to query any other server, like a database or another API…

Self-describing tokens stand in opposition to the more traditional opaque tokens. These are usually a string of characters that an API needs to check against another server to see whether it matches the one associated with a user record.

The stateless nature of a pure JWT implementation has a very important implication. A server has nothing to say about a token except whether it was signed by itself or not, so it has no way to revoke tokens individually (it could invalidate all issued tokens at once by changing its signing key, but not a single one in isolation). This means it is not possible to implement an actual sign out request on the server side.

Even if this fact satisfies a stateless purism, I consider it close to an abomination from the security point of view. It is true that, if both the client and the API belong to the same team, client-side token revocation can be kept under control. But server-side technologies have better tools to deal with security and fewer attack vectors than, say, a web browser. If the API is consumed by third party clients, then relying on them doing the right job is completely unacceptable.

However, there is nothing in the JWT technology that prevents adding a revocation layer on top of it. In this scenario, an incoming token is verified and then, as with opaque tokens, another server is queried to check whether it is still valid.

At first sight it actually seems nonsense. It looks like we end up in the same land as opaque tokens, with the additional overhead of signature verification. However, a closer look reveals that we gain some security benefits and that, in fact, there is no such overhead.

Let’s first examine which security benefits we can get. When revoking a JWT token, there is no need to store the whole token in the database. As it contains readable information, we can, for example, extract its jti claim, which uniquely identifies it. This is a huge advantage, because it means that the stored information is completely useless to an attacker. Therefore, there is no need to hash it to accomplish a good zero-knowledge policy, and no need to keep a salt value for each user to protect us from rainbow table attacks.
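To make this concrete, here is a minimal Ruby sketch of the idea using only the standard library. It is an illustration, not real JWT: a production application would use a JWT library (e.g. ruby-jwt) with proper base64url encoding instead of this hand-rolled, hex-encoded HS256-style signing, and would compare signatures in constant time.

```ruby
require 'json'
require 'openssl'
require 'securerandom'
require 'set'

SECRET = 'server_secret' # illustration only; use a long random key in real code

# Issue a token whose payload carries a unique jti claim
def sign(payload)
  body = JSON.dump(payload).unpack1('H*') # hex here, instead of JWT's base64url
  signature = OpenSSL::HMAC.hexdigest('SHA256', SECRET, body)
  "#{body}.#{signature}"
end

# Verify the signature first; then apply the revocation layer
def decode(token, revoked_jtis)
  body, signature = token.split('.')
  expected = OpenSSL::HMAC.hexdigest('SHA256', SECRET, body)
  return nil unless signature == expected            # not signed by us
  claims = JSON.parse([body].pack('H*'))
  return nil if revoked_jtis.include?(claims['jti']) # revoked
  claims
end

revoked_jtis = Set.new
token = sign('sub' => 1, 'jti' => SecureRandom.uuid)

claims = decode(token, revoked_jtis) # accepted
revoked_jtis << claims['jti']        # sign out: only the jti is stored
decode(token, revoked_jtis)          # => nil, the token is now rejected
```

Note that the revocation list never sees the token itself, only the jti, which is worthless to an attacker.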

Now, about the alleged overhead that JWT with revocation would introduce. As we said, with JWT we have to take two steps: signature verification and a server query. With opaque tokens, instead, it seems we just have to query the server. But the latter is not true. A secure opaque token implementation should not store unencrypted tokens. Instead, it should require the client to send some kind of user uid along with the unencrypted token. The user uid would be used to fetch the user, and the unencrypted token would be securely compared with the hashed one. So this hash comparison is also a second step which, even though I haven’t benchmarked it, should have an overhead similar to signature verification.

Using a standard like JWT also has some intangible benefits which are difficult to measure. For example, with current libraries you usually get integrated expiration management through the exp claim for free. However, as far as I know, there is no standard for opaque tokens, which makes libraries prone to reinvent the wheel every time. In general, using JWT should be more portable.

Of course, I’m not saying that JWT with revocation is always good and opaque tokens are always bad. Some JWT-specific attacks have been detected, which good libraries should have fixed by now, and irresponsible use of JWT carries some dangers that we’ll examine in further posts. In the end, developers must be aware of what they are using, and a secure opaque token implementation is also perfectly valid. But adding a revocation layer on top of JWT shouldn’t be dismissed so easily. In the next post, we’ll take a look at some revocation strategies that can be implemented.

Some background on the debate about JWT security and revocation:

Ruby JWT Authentication With Warden-jwt_auth

This post is about the reasons why I decided to create warden-jwt_auth as a solution to implement JWT authentication in Rack applications (which includes Rails).

What is JWT and how can it be used in Authentication?

I won’t repeat the very good information out there. Here are some relevant links:

My thoughts about the subject

In my view, if you can use cookies, go ahead and use them. They have been around for a long time and they are battle tested. Authentication is an essential security aspect of an application, and the less you move away from the standard way, the more placidly you will sleep.

So, in which situations could it be better not to use cookies? I can think of two:

  • It is not easy to share cookies between different domains. That’s not true for CORS requests, where the Access-Control-Allow-Credentials header can be used to instruct the browser to accept them. But, for instance, routine GET requests leave you out of the game.

  • Mobile platform support. This is no longer true for reasonably modern APIs, but if you need legacy support you could run into trouble.

You should be aware that JWT authentication could expose you to XSS attacks, while it is no less true that cookies could expose you to CSRF attacks. In both cases, the best you can do is use modern tools and frameworks and not try to reinvent the wheel.

So, in a project where cookies are ruled out, you still need to decide between using JWT or tokens stored on the server. In this scenario, right now I choose JWT for the same reason I chose cookies over JWT before: JWT authentication is becoming the standard for token based authentication.

Tokens stored on the server have some security concerns, like expiration management, the need to store them hashed, or timing attacks. Good libraries should take care of all of this in a simple and secure way, but I would say this is not the case in the Ruby world.
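To illustrate the timing attack concern: comparing a stored token with == returns as soon as the first byte differs, which leaks how much of a guess is correct. The usual fix is a constant-time comparison. Here is a minimal Ruby sketch of the idea; real code should rely on an audited helper, such as Rack::Utils.secure_compare.

```ruby
# Constant-time comparison: the time taken does not depend on the position
# of the first differing byte, unlike a regular == string comparison.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  result = 0
  a.bytes.zip(b.bytes) { |x, y| result |= x ^ y } # accumulate all differences
  result.zero?
end

secure_compare('stored-token', 'stored-token') # => true
secure_compare('stored-token', 'stored-tokex') # => false
```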

The DON'TS with JWT Authentication

JWT authentication is stateless, so some security issues, like timing attacks, simply do not exist for it.

However, there are still some concerns you have to keep in mind.

Two of them are simply related to the way JWT is sometimes used:

  • You must not add information that may change for a user. For instance, people often recommend adding role information about a user so that a database query can be saved. So, in your JWT code you state that the user alice has 3 as id and has the accountancy role. What happens if the accountancy privilege is revoked for alice? The token stating the opposite is still valid (that’s the nature of JWT: tokens are valid until they expire or the server secret changes), so she could impersonate a role she no longer has.

  • You must not add private information unless you encrypt your tokens. Bare JWT (without encryption) is just a signed token. It means that the server knows perfectly well whether it issued the incoming token or not, rejecting it if not. But the information contained in the token is readable by EVERYONE (try it in the ‘Debugger’ section). So I recommend encoding only harmless information, like the user id.

But there is still an important issue that is inherent to the JWT technology itself. Tokens issued by the server are valid until one of the two following conditions is met:

  • The token expiration time arrives. Libraries that decode tokens have built-in functionality to check the exp claim, which contains that information.

  • Server secret changes. When this happens, all issued tokens are invalidated.

That means the following: a sign_out request makes no sense to the server. Or, put another way: the server has no way to invalidate an individual token before its expiration time. That leaves frontends (the clients) with the responsibility of doing the actual signing out, destroying the token from local storage in the case of web browsers.

When there is just one team working on both the frontend and the backend, this can be easily controlled. But if you are developing an API that will be consumed by third parties, there is no way you can be sure they will do a good job. Anyway, in general, I think it is unacceptable to leave clients with that security responsibility.

However, there is a workaround for this problem. You can use a blacklist and add tokens to it every time a sign out is requested. Then, for incoming tokens, you should check whether they are in the blacklist before letting them in. Using a blacklist you lose the pure stateless nature of JWT, but I think not doing so is too dangerous.

Rather than storing the whole token in the blacklist, you should store just its jti claim, which uniquely identifies it. Otherwise, it would be like storing plain passwords in a database. Another option would be to hash tokens before storing them, but I think the former solution is a bit simpler.

If you are worried about the blacklist growing too much, you can always schedule a task that, every few months, changes the server secret (signing everybody out) and clears the blacklist.
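As a sketch, the whole strategy boils down to something like the following. This is a hypothetical in-memory blacklist; a real application would back it with a database table and empty it when rotating the server secret:

```ruby
require 'set'

# Hypothetical blacklist for revoked tokens, keyed by their jti claim
class Blacklist
  def initialize
    @jtis = Set.new
  end

  # Called on sign out: only the jti claim is stored, never the whole token
  def revoke(claims)
    @jtis << claims['jti']
  end

  # Called on every request, after the token signature has been verified
  def revoked?(claims)
    @jtis.include?(claims['jti'])
  end
end

blacklist = Blacklist.new
claims = { 'sub' => 1, 'jti' => 'f81d4fae-7dec' } # from an already verified token

blacklist.revoked?(claims) # => false, let the request in
blacklist.revoke(claims)   # the sign out request
blacklist.revoked?(claims) # => true, reject the token from now on
```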

What did I expect from a JWT authentication Ruby library?

When I looked for current Ruby libraries helping with JWT authentication, I wanted them to meet a set of conditions:

  • It should rely on Warden, a heavily tested authentication library that works for any kind of Rack application.
  • It should be easily pluggable into Rails with Devise (which uses Warden). That way, in Rails applications, I could use database authentication for the sign in action and JWT for the rest.
  • Relying on Warden and not being a full authentication system, it should be simple enough to keep security audits simple, too.
  • Zero monkey patching. A lot of libraries meant to work with Devise do a lot of monkey patching. I don’t like that.
  • It should be ORM agnostic.
  • It should have some kind of built-in support for maintaining a token blacklist.

I looked at what had been done so far, and here is what I found (in summary, none of them had support for blacklists):

  • Knock. Surely the most serious attempt at implementing JWT authentication for Rails applications.

Ruby Gems Could Harm Your Memory

It is easy to be amazed by the wonderful world that the open source community has created around Ruby gems. There seems to be a library for everything you can imagine. Some people can even get the illusion that software development is as easy as putting together the pieces of a puzzle: just take Rails, put this gem here and that one there, and you are done.

It is true that it is amazing. So much work done thanks to the enthusiasm and generosity of so many people. But it is no less true that adding gems to your application is not free at all. The most important thing to consider is that you are depending on code that neither you nor your team wrote, and this can get you into trouble when it changes or when it has to cooperate with the rest of your application universe.

But in this post I want to emphasize another side effect of adding external dependencies: memory consumption. When you add that apparently innocent line to your Gemfile, you are fetching a bunch of code that may or may not be written carefully, and may or may not do much more than you need. On the other side, your server resources are not unlimited, and you don’t want memory exceeded errors to appear in your logs.

Following are some tips that can help minimize your application’s memory consumption related to its dependencies.

  • Only install the gems you really need. If there is a gem that does some fancy thing that is very cool for your code but has no actual value, you are probably better off not adding it. (UPDATE 2016-09-11: As suggested by Sergey Alekseev in a comment, I point here to a list of gems with known memory issues.)

  • Add the gems you need in their appropriate group. For example, if you use rubocop as a code analysis tool, you only need it in the development environment:

# Gemfile

# Bad. You will add this gem to the production environment even if you are not using it there
gem 'rubocop'

# Good
gem 'rubocop', group: :development

  • If you add a gem which is only used to run some rake tasks, do not require it at load time. You can require it in the Rakefile. E.g.:
# Gemfile

gem 'seed-fu', require: false

# Rakefile

require 'seed-fu'

  • If a gem’s scope is very well delimited, you can also require it only where it is needed. For example, prawn is used to generate PDFs. If you have an ApplicationPdf class which is the base class for any other PDF, you could do something like:
# Gemfile

gem 'prawn', require: false

# application_pdf.rb

require 'prawn'

class ApplicationPdf
  # ...
end

  • Try to keep your gems updated. Hopefully their maintainers will keep improving them in this aspect, too.

  • From time to time, monitor your gems’ memory consumption. For Rails projects you can use derailed_benchmarks.

Of course, doing the opposite would be even crazier. Don’t reinvent the wheel in every application; make use of the good, battle-tested gems out there. For example, it would be a tremendous mistake to implement a custom authentication system with devise at your disposal. Just be judicious. In the end, software development is much more fun because your judgement is your best asset, not being good at putting together the pieces of a puzzle.

Rails: Rescue From All Foreign Key Violations at Once

UPDATE: With time I have come to see that this is clearly an anti-pattern: it may seem a good idea, but it treats an exception that can arise for different reasons as if it were always for one of them. So better avoid this and handle each case in your application logic.

With Rails 4.2 already out of the box, foreign key definition support comes bundled in its core. Before 4.2, the best option to manage these important database constraints was the great foreigner gem.

Anyway, if you are using foreign keys in your Rails application, it’s easy to come across the situation where your application crashes due to an attempt to destroy a resource whose id is a foreign key of another one. Trying to handle these situations on a case-by-case basis is quite painful, but with a very simple trick we can rest assured that it won’t happen. (UPDATE: Please read the comments conversation with Robert Fletcher. He pointed out some important concerns you have to keep in mind before adopting this mostly pragmatic solution.)

When ActiveRecord encounters a foreign key violation, it raises an ActiveRecord::InvalidForeignKey exception. Even if its documentation just says that it is raised when a record cannot be inserted or updated because it references a non-existent record, the fact is that it is also raised in the case we are interested in.

With that and an around filter, considering we only delete in destroy actions, we can just add this to ApplicationController or to a controller concern:

around_action :rescue_from_fk_constraint, only: [:destroy]

def rescue_from_fk_constraint
  yield
rescue ActiveRecord::InvalidForeignKey
  # Flash and render, render API json error... whatever
end

In case we need more fine grained control, we could also rescue in the model layer through a reusable concern, adding a meaningful error in each situation and leaving to the controller the task of rendering the error, but in many cases (surely mainly REST APIs) this simple trick should be enough.

Read Your Newsbeuter Feeds in Epub Format

I love the newsbeuter CLI RSS reader. When I discovered it, I saw it was what I had been looking for for quite a long time. It is very quick and easy to navigate through the feeds, and when I find one that seems interesting enough, I press the o key and read it in full-colors mode in my GUI web browser.

I also love the calibre recipes API. After hours working in front of my computer screen, sometimes I don’t want to read anything more on it, and thanks to this API I can convert any RSS feed to an epub.

However, I don’t like to do things twice, so when I find an interesting new RSS feed I don’t want to add it first to newsbeuter and later prepare a calibre recipe for it. So I thought it would be great to have a calibre recipe whose feeds are taken directly from the newsbeuter urls file.

The following is a dynamic calibre recipe whose feeds are taken from ~/.newsbeuter/urls. It expects the lines in the urls file to be in the format:

http://address_to_the_feed.com "~Name of the feed" tags

The special newsbeuter tag "~" is intended to hold the feed name. In my script it must be the first tag to appear after the url. If you don’t like that, it is very easily modifiable.
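For illustration, parsing one of those lines could look like this. This is a hypothetical helper, not the actual recipe code; it only assumes, as described above, that the "~title" token comes quoted right after the url:

```python
import shlex

def parse_urls_line(line):
    """Split a newsbeuter urls line into (url, title, tags)."""
    # shlex honors the double quotes around the "~Name of the feed" token
    tokens = shlex.split(line)
    url, rest = tokens[0], tokens[1:]
    title = None
    tags = []
    for token in rest:
        if token.startswith('~') and title is None:
            title = token[1:]  # strip the "~" marker
        else:
            tags.append(token)
    return url, title, tags

parse_urls_line('http://address_to_the_feed.com "~Name of the feed" tags')
# => ('http://address_to_the_feed.com', 'Name of the feed', ['tags'])
```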

Running ebook-convert newsbeuter.recipe .epub will generate an epub with all the feeds from the newsbeuter urls file.

Furthermore, you can filter by tags abusing the --tags cli option (which will also introduce those tags in the meta information of the generated e-book, but that is actually good).

ebook-convert newsbeuter.recipe .epub --tags="english,newspaper"

The CSS is taken from gutenweb, a simple typographical stylesheet I developed to have out-of-the-box styles that should be comfortable to read. Even if it won’t be perfect for every feed’s mark-up, it is quick and realistic about the amount of time I can spend tweaking a recipe’s style.

A disclosure: I never program in Python except for the occasional calibre recipe. So the script surely has a lot of things to improve, and I would be happy to receive any feedback.

Last, as the icing on the cake, with the following function in your ~/.bashrc file you can do cool things like:

epub english newspaper

to create the epub file with all the feeds having both the english and newspaper tags. Of course, substitute in it the path to the newsbeuter.recipe file for the one you need.
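The function itself could be as small as the following sketch (the recipe path here is an assumption; point it to wherever your newsbeuter.recipe file lives):

```shell
# Join all the arguments with commas: "english newspaper" -> "english,newspaper"
join_tags() {
  local IFS=,
  echo "$*"
}

# Generate the epub, filtering the feeds by the given tags
epub() {
  ebook-convert ~/newsbeuter.recipe .epub --tags="$(join_tags "$@")"
}
```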

Append Issue Number to Commit Message Automatically With Git Hooks

A great feature of the integration between issue tracking systems and repository hosting services is the possibility of referencing an issue from a commit message. You add the issue number and everything gets nicely linked, allowing you to later explore all the commits that led to an issue’s resolution.

But long numbers are mainly for machines, and it is a pain having to add the issue number each time you commit something, especially if, like me, you tend to commit quite often.

This repetitive and boring task is ideal for some kind of computing automation magic, and git hooks have a lot of it.

Git comes with the prepare-commit-msg hook, which, as its man page states:

The purpose of the hook is to edit the message file in place […]

Just by taking into consideration our workflow and our preferences, we can get a lot of benefit from it. In the following, I explain what works for me, but it can be easily tweaked to fit other preferences.

As I said before, I think long numbers are for machines, so I don’t want to give the issue number a prevailing visual importance. I just want it to be there for machines to use. So I don’t want the number to appear on the top commit message line, but at the beginning of the body, from where I can move it if I ever need to.

But where will the number be read from? Taking advantage of another best practice, feature branches, we can name them like user_auth-87369, where everything after the dash (-) is the issue number. Again, I want it at the end of the string.

Then, with the following script in the .git/hooks/prepare-commit-msg file, on every commit the number will be automatically extracted and inserted in the third line of the message.
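The original listing is not reproduced in this archive, so here is a hedged reconstruction from the description of the script. It assumes GNU sed for the in-place insertion:

```shell
#!/bin/sh
# .git/hooks/prepare-commit-msg
# $1 is the file containing the commit message; $2 is the message source, if any.

# Everything after the last dash of the branch name is the issue number
extract_issue_number() {
  echo "$1" | sed 's/.*-//'
}

# Do nothing when the message comes from -m, a merge, a template...
if [ -n "$1" ] && [ -z "$2" ]; then
  branch=$(git symbolic-ref --short HEAD 2>/dev/null)
  number=$(extract_issue_number "$branch")
  # Insert the issue number as the third line of the commit message file
  [ -n "$number" ] && sed -i "3i #$number" "$1"
fi
```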

So, when the editor is opened and I edit the commit message, it looks something like this:

Add authentication gem

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch user_auth-87369
# Changes to be committed:
#   modified:   Gemfile

The script is really simple. To understand it, first it is good to know which parameters git passes to it. Again, from the man page:

It takes one to three parameters. The first is the name of the file that contains the commit log message. The second is the source of the commit message, and can be: message (if a -m or -F option was given); template (if a -t option was given or the configuration option commit.template is set); merge (if the commit is a merge or a .git/MERGE_MSG file exists); squash (if a .git/SQUASH_MSG file exists); or commit, followed by a commit SHA-1 (if a -c, -C or --amend option was given).

So, first the script checks whether $2 is set. If so, it does nothing, because it would mean that the message is coming from the command line, a merge, a template… These are situations where you usually don’t want the issue number, even if that may not always be the case (adjust it to your needs). Then it uses sed to extract the number from the branch name and append it to $1, which is the name of the file containing the default commit message.

With this simple script in place, - can only be used in the branch name to separate the issue number, but for me that’s all right because for the other cases I use _. As I said, it is very easily customizable.

If you want this to be present in any brand new git repository you create, take a look at the git init template directory.
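For example, wiring the hook into git’s template directory could look like this (hypothetical paths; the hook body below is just a placeholder for your actual script):

```shell
# Create a template hooks directory that `git init` will copy into new repos
mkdir -p ~/.git-template/hooks

# Drop your prepare-commit-msg script there (placeholder body shown)
printf '#!/bin/sh\n# your prepare-commit-msg logic here\n' \
  > ~/.git-template/hooks/prepare-commit-msg
chmod +x ~/.git-template/hooks/prepare-commit-msg

# Tell git to use that directory as the template for every `git init`
git config --global init.templateDir ~/.git-template
```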

Distributable and Organized Dotfiles With Homeshick and Mr

Dotfiles are the personality of your computer. Without them, it would be like any other computer in the world. That’s why it is a good idea to keep them in a distributable way, so that you can import them easily when you migrate to another system.

But dotfiles have a well-known problem when it comes to making them distributable: they are all scattered around your home directory, which makes it difficult to create repositories from them.

As far as I know, there are two main tools that help us with it:

  • vcsh allows you to maintain several git repositories in one single directory (home, when talking about dotfiles). To achieve this, it uses a little bit of magic through git internals.

  • homeshick creates regular git repositories with just one special condition: they have in their root a directory named home, inside of which you’ll put your dotfiles. Later, you can tell homeshick to symlink the contents of that directory into your actual home directory. Of course, applications using your dotfiles won’t notice the difference. (Note: homeshick is based on homesick, but I prefer the former because it is written in bash, while homesick is a ruby gem, so you would need more dependencies when importing your dotfiles into a pristine system.)

It is just a matter of taste which one to choose. Personally, I prefer homeshick because of its convention about the home directory. That way I can put outside of it other files that I want to keep close to my dotfiles but not mixed into my actual home directory (remember, only the contents of that directory are symlinked), like a README or scripts to install the application that will use those dotfiles.

Installing homeshick is very easy; you can follow its homepage instructions. Repositories created with it are called castles (just a name), and working with them is also very easy. Here is what you could do to create your vim castle:

homeshick generate vim-castle # Create a repository called vim-castle with an empty home directory inside
homeshick track vim-castle ~/.vimrc # Add .vimrc inside the home directory of your vim castle. Automatically ~/.vimrc is now a symlink
homeshick cd vim-castle # Enter castle
git commit -am 'Initial configuration' # Commit your changes
git remote add origin git@github.com:username/vim-castle.git # Add a remote repository
git push # Push your changes

Now we are able to create repositories from our dotfiles, keep track of our configuration changes and push them to a safe place from where we will be able to pull when we need them. How would we recover them on another machine? Easy:

homeshick clone git@github.com:username/vim-castle.git # When starting from scratch
homeshick pull vim-castle # To update it with the remote changes

So far so good. But we use vim, tmux, tmuxinator, zsh, newsbeuter, mutt… a lot of dotfiles, a lot of castles, a little mess… Why don’t we create one single castle with all of our dotfiles? For some people that can be a reasonable option but, in general, keeping them organized has some advantages:

  • You can keep different configurations for the same application: one ssh configuration at home and another at work.

  • You can keep public the dotfiles you would like to share with the community (vim) and private the ones you don’t want to (mutt).

  • You can pick which castles you would like to recover. Maybe you don’t want newsbeuter at work.

Here is where the other star comes in: myrepos, a tool that allows you to work with a lot of repositories at once. With it, you can push, pull, commit and run other operations at the same time on a set of registered repositories.

Installing it is again very easy. It has a self-contained mr executable whose only dependency is perl. You can find more details on its homepage. Once done, you can run mr help to learn about the bunch of magic you can do with it.

Let’s see a possible workflow for our dotfiles. Imagine we have just two castles, vim-castle and tmux-castle. First, mr needs those repositories to exist in your filesystem and to already have a remote registered.

homeshick cd vim-castle # Enter your vim castle
mr register # Register it to mr
homeshick cd tmux-castle # Enter your tmux castle
mr register # Register it to mr

Once the above is done, you should have a ~/.mrconfig file with something like the following:

[.homesick/repos/vim-castle]
checkout = git clone 'git@github.com:username/vim-castle.git' 'vim-castle'

[.homesick/repos/tmux-castle]
checkout = git clone 'git@github.com:username/tmux-castle.git' 'tmux-castle'

Between the square brackets [] are the local filesystem locations of the repositories (relative to home; the one used in the example is the default homeshick location), and the value of the checkout option is the command that mr will run to check out your repositories.

Then, when you migrate to a new system, you just have to get your .mrconfig back (so it is a good idea to build another castle with it) and run:

mr checkout # Checkout all the repositories in one single command 

Or, if you prefer, you can run mr bootstrap <url> to get the mrconfig file from an external URL.

With any of the above commands you will have recovered all your castles without pain, and now you just have to create the symbolic links in your home directory:

homeshick link # Create symlinks from all your castle files in the actual home directory

With mr the workflow is really easy. You can run mr push to update all remotes at once, mr commit -m 'message' to commit the changes you have made across different castles…

Another very interesting option is to use its hooks to run scripts, for example to install the application using a castle after checking it out, or simply to prepare scripts that set up other aspects of your system.

Having this bit of discipline with your dotfiles is highly rewarding. This way you can keep the different systems where you work synchronized and, the next time you have to migrate to a new system, you will only need the almost ubiquitous dependencies git, bash and perl to feel at home again.

Stop and Restart Easily a Rails Server

I like to start the rails server as a daemon with rails server -d, because I don’t want a terminal window or a tmux pane hanging around with the server foreground process.

But each time a new gem is added to the application, or some changes are made to a configuration file, it is a pain to manually look for the pid in tmp/pids/server.pid, kill it and start the server again.

I wished I could have convenient rails start, rails stop and rails restart commands to help me in this workflow. And it is very easy to get.

Below you’ll find a simple bash (or zsh) function that should be added to your .bashrc (or .zshrc). Once done, you’ll be able to do things like the following:

rails start # To start the server in development environment
rails start production # To start the server in production environment
rails stop # To stop the server
rails stop -9 # To stop the server sending -9 kill signal
rails restart # To restart the server in development environment
rails restart production # To restart the server in production environment
rails whatever # Will send the call to original rails command

Here is the function:
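The original listing is not included in this archive, so the following is a reconstruction from the description below. The pid file path comes from the post itself; the exact rails server flags are assumptions:

```shell
rails() {
  if [ "$1" = start ]; then
    # Start daemonized, optionally in the given environment
    command rails server -d ${2:+-e "$2"}
  elif [ "$1" = stop ]; then
    if [ -f tmp/pids/server.pid ]; then
      # Send the given signal (TERM by default) to the daemonized server
      kill "${2:--15}" "$(cat tmp/pids/server.pid)"
    else
      echo "No server pid file found" >&2
    fi
  elif [ "$1" = restart ]; then
    rails stop && rails start "$2"
  else
    # Fall back to the original rails executable for any other subcommand
    command rails "$@"
  fi
}
```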

As you see, the code holds not much mystery. Inside the else block, command is a bash built-in which executes the given command (in this case rails), ignoring any shell function with its name (our function). So we use it to fall back to the original rails executable when the first argument is not start, stop or restart, passing along any other arguments with $@.

If you don’t feel right overriding rails, just change the function name and remove the else block.