2023-10-13 SSL outage

Symptom

Browsers showed the usual “no no no, connection is not secure” SSL error, which made MyAEGEE unreachable and also broke the login from the “OMS JC module”.

Cause

Probably a combination of the deprecation of OpenSSL 1.1 and the fact that we run Traefik 1.7.x, which is EoL. As a result, a certificate was not generated when it was supposed to be (Traefik generates certificates automatically through Let’s Encrypt).

Mitigation

  • SSH to the server

  • Install certbot manually

  • Stop traefik to free port 80

  • Generate the certificates and copy them to the traefik folder (and chown them to grasshopper:developers)

  • Edit traefik.toml

    • remove the acme part

    • add the manual certificates

  • Edit docker-compose and mount the certificates in the container

  • ???

  • PROFIT!!
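
The steps above can be sketched roughly as the script below. It is a reconstruction, not a record of the exact commands that were run. DOMAIN, CERT_DIR, the https entry point name and the /certs path inside the container are hypothetical placeholders. The use of certbot's standalone mode is an assumption based on the need to free port 80, and the container name is taken from the renewal steps. How certbot was installed is not recorded, so that step is left out. DRY_RUN defaults to 1, so the commands are only printed.

```shell
#!/bin/sh
# Sketch of the manual mitigation. DRY_RUN defaults to 1: commands are printed, not run.
# DOMAIN and CERT_DIR are hypothetical placeholders, not values from the incident.
DRY_RUN="${DRY_RUN:-1}"
DOMAIN="${DOMAIN:-example.org}"
CERT_DIR="${CERT_DIR:-/path/to/traefik/certs}"
LIVE_DIR="/etc/letsencrypt/live/$DOMAIN"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

issue_certs() {
    # Free port 80 so certbot can answer the Let's Encrypt challenge.
    run docker stop myaegee_traefik_1 || return 1
    # Assumption: standalone mode; the notes only say port 80 had to be free.
    run certbot certonly --standalone -d "$DOMAIN" || return 1
    # The files under live/ are symlinks, so copy what they point to.
    run cp -L "$LIVE_DIR/fullchain.pem" "$LIVE_DIR/privkey.pem" "$CERT_DIR/" || return 1
    run chown grasshopper:developers "$CERT_DIR/fullchain.pem" "$CERT_DIR/privkey.pem"
}

# Fragment to merge by hand into traefik.toml once the acme part is removed.
# The entry point name and the in-container paths are assumptions; the
# docker-compose file would need to mount CERT_DIR at /certs to match.
print_tls_fragment() {
    cat <<'EOF'
[entryPoints.https.tls]
  [[entryPoints.https.tls.certificates]]
  certFile = "/certs/fullchain.pem"
  keyFile = "/certs/privkey.pem"
EOF
}

issue_certs
print_tls_fragment
```

Running it with DRY_RUN=0 as root would execute the commands for real. Traefik stays stopped afterwards, since traefik.toml and docker-compose still have to be edited by hand before it is started again.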

Instructions for renewal

Since the initial mitigation, the steps for renewal have become simpler.

  • docker stop myaegee_traefik_1 (to free port 80)

  • certbot renew

  • copy fullchain.pem and privkey.pem from the domain's folder under /etc/letsencrypt/live to the traefik cert folder

  • docker restart myaegee_traefik_1
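
The renewal steps could be wrapped in a small script such as the sketch below. DOMAIN and CERT_DIR are hypothetical placeholders that must be set to the real values, and the restart-even-on-failure behaviour is an addition that is not part of the steps above. DRY_RUN defaults to 1, so the commands are only printed.

```shell
#!/bin/sh
# Sketch of the renewal steps. DRY_RUN defaults to 1: commands are printed, not run.
# DOMAIN and CERT_DIR are hypothetical placeholders.
DRY_RUN="${DRY_RUN:-1}"
DOMAIN="${DOMAIN:-example.org}"
CERT_DIR="${CERT_DIR:-/path/to/traefik/certs}"
LIVE_DIR="/etc/letsencrypt/live/$DOMAIN"
CONTAINER="myaegee_traefik_1"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

renew_certs() {
    # Free port 80 for certbot.
    run docker stop "$CONTAINER" || return 1
    run certbot renew
    status=$?
    if [ "$status" -eq 0 ]; then
        # The files under live/ are symlinks, so copy what they point to.
        run cp -L "$LIVE_DIR/fullchain.pem" "$LIVE_DIR/privkey.pem" "$CERT_DIR/"
        status=$?
    fi
    # Bring Traefik back even if the renewal or the copy failed.
    run docker restart "$CONTAINER"
    return "$status"
}

renew_certs
```

Restarting Traefik regardless of the outcome keeps the downtime short: if the renewal fails, the site comes back with the old certificate and the non-zero exit status signals that someone has to look at it.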

Notes

This should not happen again, as we will soon migrate to MyAEGEE v2, which gets rid of both Ubuntu 16 (I was even afraid I could not install certbot there..!) and Traefik 1.7.x.