[SSO - Keycloak] sync users and groups membership in the app

Hey!

Long time no see :slight_smile:

I have a topic to discuss, and I think this is the best place to do so.

Context

As you may or may not know, we offer RocketChat and Nextcloud behind Keycloak to our users.
We use oauth/oidc for RocketChat and saml for Nextcloud.

(Yes, we know that RocketChat is abandoning oidc in the community edition. Let’s not discuss this here; open a new thread if you want to collaborate on keeping the feature in the community edition for future versions. We plan to work on this as well, unfortunately…)

Problem to solve

With this approach, there are three main problems:

  • group membership and user attributes are only synced at login time
  • group creation and user creation are only synced at login time
  • group and user deletion are never garbage collected on RocketChat and Nextcloud

And the main consequences of these problems are:

  • gdpr compliance regarding the data of deleted users
  • security around deleted users (if a user is deleted and somebody later creates an account with the same username…)
  • UX: as an admin, when I create a user and a group in keycloak and assign this user to the group, I expect to see this replicated in Nextcloud instantly

Solution space

are we alone?

A good question to ask is “are we alone in having this issue?”

Of course not :slight_smile:

So basically, during this investigation, I realized that ldap is actually what is used to solve this use case.

And it is already implemented in RocketChat and Nextcloud.

Ldap can be synced roughly once an hour in Nextcloud and RocketChat.

And yeah, this is really nice: ldap is indeed a standard for user and group membership directories. So this is a really nice solution.

Then you get the ldap for the user sync part, and oidc or saml for user login which is a lot nicer than ldap, especially with 2FA :slight_smile:

Problems with ldap

But yeah, ldap is not all rainbows and unicorns.

  • one more database to store state about your user directory (ldap, keycloak, Nextcloud)
  • ldap is another database, and if you want High Availability, it is another layer of complexity
  • syncing once an hour can still be problematic in terms of UX
  • syncing once an hour is not ecological for an organisation that updates its directory once a month
  • we don’t want to expose user passwords to the app (avoiding this might be possible through configuration)
  • we depend on the app’s implementation of ldap and oidc (or saml)

a solution without ldap?

So the requirements for a solution could be this:

  • not another database with state to store/sync
  • nice in terms of UX (a change is made, and replicated almost immediately)
  • nicer in terms of environment
    • not deploying ldap is probably nicer in terms of eco impact
    • not syncing once an hour is probably also nicer

Our proposal, which we plan to work on

So after some discussions internally, we came up with the following plan.

And we post it here to gather early feedback from orgs like you that can be interested.

(We’ll develop it in go. If you plan to develop it yourself, choose your own language, but I’m not here to discuss which language we’ll work with.)

Phase 1 - entire sync - cron based

The phase 1 of the project is to have a generic sync tool, like ldap, but more generic.

The idea is to pull from a source of truth (keycloak) and push to targeted apps (nextcloud & rocketchat).

This tool can be deployed easily by any org, to sync the entire directory at the frequency they like.

And during phase 1, we’ll develop the following connectors:

  • from keycloak
  • to Nextcloud
  • to RocketChat

We’ll make it modular so if you want to develop a from ldap connector, or a to discourse connector, it shouldn’t be that much work on your side.

(Here I say from, to, but in the end, I’m not sure it is relevant, we’ll see during implementation, but indeed, we just need this)

At the end of phase 1, we get the same result as with ldap without having to run ldap, which will already be a nice improvement over our current situation.
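To make the "from" and "to" connector idea more concrete, here is a minimal sketch in go. Everything in it is an assumption for illustration: the `Source` and `Target` interfaces, the `User` type and the in-memory `memory` connector are hypothetical names, not an existing API, and a real Keycloak or Nextcloud connector would call the respective HTTP APIs instead.

```go
package main

import (
	"fmt"
	"sort"
)

// User is a deliberately small, hypothetical directory entry.
type User struct {
	Username string
	Groups   []string
}

// Source is a hypothetical "from" connector (for example Keycloak).
type Source interface {
	ListUsers() ([]User, error)
}

// Target is a hypothetical "to" connector (for example Nextcloud or RocketChat).
type Target interface {
	ListUsers() ([]User, error)
	UpsertUser(u User) error
	DeleteUser(username string) error
}

// memory is an in-memory stand-in that satisfies both interfaces.
type memory struct{ users map[string]User }

func (m *memory) ListUsers() ([]User, error) {
	names := make([]string, 0, len(m.users))
	for n := range m.users {
		names = append(names, n)
	}
	sort.Strings(names) // stable order makes runs reproducible
	out := make([]User, 0, len(names))
	for _, n := range names {
		out = append(out, m.users[n])
	}
	return out, nil
}

func (m *memory) UpsertUser(u User) error      { m.users[u.Username] = u; return nil }
func (m *memory) DeleteUser(name string) error { delete(m.users, name); return nil }

// FullSync makes the target mirror the source: every source user is
// created or updated, and target users unknown to the source are deleted.
func FullSync(src Source, dst Target) error {
	want, err := src.ListUsers()
	if err != nil {
		return fmt.Errorf("list source: %w", err)
	}
	have, err := dst.ListUsers()
	if err != nil {
		return fmt.Errorf("list target: %w", err)
	}
	keep := make(map[string]bool, len(want))
	for _, u := range want {
		keep[u.Username] = true
		if err := dst.UpsertUser(u); err != nil {
			return fmt.Errorf("upsert %s: %w", u.Username, err)
		}
	}
	for _, u := range have {
		if !keep[u.Username] {
			if err := dst.DeleteUser(u.Username); err != nil {
				return fmt.Errorf("delete %s: %w", u.Username, err)
			}
		}
	}
	return nil
}

func main() {
	keycloak := &memory{users: map[string]User{"alice": {Username: "alice", Groups: []string{"staff"}}}}
	nextcloud := &memory{users: map[string]User{"bob": {Username: "bob"}}}
	if err := FullSync(keycloak, nextcloud); err != nil {
		panic(err)
	}
	users, _ := nextcloud.ListUsers()
	fmt.Println(users)
}
```

Splitting the interface into a read-only source and a writable target is only one possible cut; if "from" and "to" turn out not to be relevant, a single provider interface would work just as well.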

Phase 2 - granular sync - event based

In a second phase, we can go further.

Keycloak does have a way to emit events, and there is a plugin to make webhook calls from these events.

Based on the phase 1, we can develop an api around the libs used in phase1, that would sync only the user or the group that is concerned by the event.

The problem with webhooks is that you can miss some. We could work around this with a persistent queue like nats or rabbitMQ, but this adds complexity. We can also just keep the full, cron-based sync at a much lower frequency; this keeps complexity low and is just as resilient.

This phase 2 would be more involved for orgs that want to deploy it, but we think it is a lot nicer in terms of environment and UX.

Concept

Overview

┌───────────────────┐
│                   │     Directory
│ Keycloak Provider ├─────────────────┐
│                   │                 │   ┌────────┐
└───────────────────┘                 └──►│        │
                                          │  Core  ├─────┐
┌────────────────────┐                ┌──►│        │     │
│                    │    Directory   │   └────────┘     │
│ Nextcloud Provider ├────────────────┘                  │
│                    │                                   │
└────────────────────┘                                   │
          ▲                                              │
          │                                              │
          │                   Changelog                  │
          └──────────────────────────────────────────────┘

Flow

  1. Get directories from providers
  2. Optionally perform pre-processing
  3. Generate a changelog of the diff between directories
  4. Optionally perform post-processing
  5. Patch provider with the changelog

Feedback ?

We know some librehosters that would be interested in that, and we’d love your feedback :slight_smile:

I’m definitely no OIDC/Oauth2 expert, but have you considered logging people out when their groups have changed? If I understand your post correctly, that should trigger a group & attributes sync once the user tries to use the app again, and needs to log in.

I believe oauth2 supports a /logout endpoint, but we have not used it yet. I also believe not every application implements it. And then of course you still haven’t solved the garbage collection problem.

And do you think the problem might be easier/less work to solve if you only support OIDC, or wouldn’t it matter? We’re using the sociallogin plugin for Nextcloud, which works fine for us so far.

I’m not an expert either.

Logging out is still not perfect. As you said, we still have the GC issue. Plus, there is user creation: as long as the user has not logged in to the tool, the user is not created, hence not visible. Say you want to share a file with this newly created user: it doesn’t work.
Same goes for groups creation/deletion.

OIDC, like SAML, is a “frontend” login solution: Nextcloud doesn’t talk directly to Keycloak; they communicate through the user being redirected between web pages.
Basically, we need a backend solution to sync users and groups. Historically, this has been done with ldap, but it is not perfect, and I don’t understand it :slight_smile: So we are better off developing a new one :slight_smile:
(Half joking here, I still think we have better arguments than NIH classic one :slight_smile: )

Great approach to plan developing a service that is syncing users and group memberships from keycloak to its clients. Pre-population of users currently works with nextcloud being connected in addition to an LDAP user store. Pre-population is something administrators (especially if they are used to Microsoft tools) expect: you should be able to share a folder or a chat group to a user, even if s/he has not yet signed up with the specific service.

Please share any repo or where we can help to specify details.

We got sidetracked with other projects, but we’ll start it before December, that’s for sure :slight_smile:

And yes, prepopulation is one of the main pain points :slight_smile:

Hi @pierreozoux, this indeed seems to be a recurring problem. My first thought as I was reading your proposal was: why not use a synchronization mechanism like ActivityPub or XMPP or RabbitMQ that you mentioned? Group changes sound like a perfect example for such announcements, and the publish-subscribe pattern goes a long way solving it efficiently. If your client only changes once per month, then it’s a single synchronization step, no need to run a crontab at all, it optimizes for actual usage.

That said I completely agree with the phases: one for full synchronization and the next one for granular (on-demand) synchronization.

I’m not sure about the energy efficiency of running Keycloak vs. LDAP, but in the case of preferring Keycloak, then could this be implemented with an event broker? There are some Keycloak extensions already supporting some listeners (MQTT, RabbitMQ, even one for pubsub on the Gaggle cloud that could serve as a starter).

Using an event broker sounds good.

According to the RC EE-vs-CE comparison table, keycloak would additionally need to publish the following for oAuth:

  • Assign Rocket.Chat roles based on OAuth roles
  • Join channels automatically based on OAuth roles

And we need to manage leaving users.

Mapping groups from keycloak to nextcloud is already working well with the SAML member attribute. But removing users from groups they have left in the IDP seems to be triggered only at their next login, so they’d still get notifications etc., which is not ideal.

Mapping groups to RocketChat requires a mapping table when doing it with SAML. Same here: users who leave are not removed from the respective RC groups.

I would suggest to map all groups in an IDP to a Team in RC (feature introduced in RC 3.13) and fill / delete the members with a cron job or event based. You still can hide those teams you do not need in the left panel.

This month was proposed as the start of work on the generic sync tool. Any date for a kick-off?

Yes, we have a ceph cluster dying and we are moving to a minio cluster; this was a bit unexpected :slight_smile:

But yes, we are still planning to do it, maybe starting in December, but more probably at the beginning of Jan :slight_smile:

Should we plan a kick-off? How about 19 or 20 Jan 22 afternoon?

It’s nice to see people interested! I’ll be working on this during the next couple of months.
We just discovered the SCIM standards (http://www.simplecloud.info/) and we plan to build upon them.
A kick-off is a great idea! 20 Jan is good for us. 14h30 CET on https://meet.liiib.re/scim ?

Thursday 14h30 is good for fairkom folks.

There is already a nice SCIM plugin for keycloak https://github.com/Captain-P-Goldfish/scim-for-keycloak

So we need to test it and focus on the clients.

Nice !
And this one https://github.com/suvera/keycloak-scim2-storage :wink:
We’ll need app adapters which are both SCIM Service Provider (for push strategy) and SCIM Client (for pull/initial reconciliation).

We tried suvera/keycloak-scim2-storage. It only does user creation, and it’s based on a dumb loop. So we don’t think we should use this one.
As mentioned in this issue, Keycloak doesn’t have internal middleware APIs to reject a response in case of a SCIM error.
For now, we see only 2 options for the KC Client:

  1. Dev a KC proxy, which would allow us to achieve strong consistency and realtime propagation, but is trickier to dev because it could lead to bad behaviors or create an attack surface.
  2. Dev a KC extension or a sidecar program which would loop or react on KC events. It would need a way of storing metadata (postgres, k8s cm), such as the events start date for the next loop. It’s less sensitive, but in case of error we just need a good logging system, because nothing is actionable.

We lean toward option 2: a Golang program that stores metadata in a k8s config map or in an S3 bucket. If your Java dev wants to, he can build the same logic as a KC extension and maybe extend the KC DB.
What do you think?

Keycloak is moving from WildFly to Quarkus, so the extensions probably won’t work anymore. We should rather aim at a sidecar, maybe using REST. See their roadmap Keycloak - Search

Let’s discuss this at our next meeting on Friday Feb 4 2022 14:30 CET on https://meet.liiib.re/scim

Pad: Libre SCIM - v1.0 - HedgeDoc

Next status meeting Friday Feb 11 2022 14:30 CET on https://meet.liiib.re/scim

Here is the Keycloak Client POC I made : git.
I first built it in Kotlin to experiment with the language, and it was working. Then I tried to make it work with Keycloak.X (Quarkus), and I fiddled a lot with the dependencies, which didn’t work (something with the rest client not being provided by the runtime). And I finally moved it to Java because Kotlin support in VSCodium isn’t great.
Anyway, there is now an error with JPA that I don’t quite understand. If someone has a little time to help me fix this and clean up the pom.xml, it would be much appreciated.

We got a nice demo from @hrenard to create and delete users in rocketchat with an SCIM provider in keycloak.

Next status meeting Fr 25 Feb 2022 11am on https://meet.liiib.re/scim

We added some rough issues in git to represent the work to be done. Feel free to create an account, comment or take on some of them.

Hey @rasos, I would like your opinion on a problem. I’m working on KC group to RC team support and there are some inconsistencies.
RC doesn’t allow the use of the admin username, so the default KC realm admin is excluded from the SCIM logic. RC needs a user with the admin role, and he will probably be the same one who gives his RC API token to the SCIM app.
The same user will also be the owner of all teams created via SCIM. But KC doesn’t know who he is. And RC needs to keep him in the member list when adding or removing other users.
Here are the options I thought of, ordered by simplicity:

  • KC stays ignorant and the RC SCIM app adds the admin to the member list. (Best for the 1st implementation)
  • We add a role or an attribute on KC users who need to auto-join all groups and stay in them.
  • Upon group creation, a group admin user is created and joins/stays in the group.

(Joining and staying might be complicated)
Do you have specific requirements?