Data cleansing just became a lot easier

Novell has released a first version of Novell Enforcer. They blogged about this earlier. The tool supports the process of data cleansing and control in three different phases:

  • In the first phase, dubbed “Analysis”, the tool gives you deeper insight into the quality, content and structure of the identity data in the various repositories.
  • The next phase, dubbed “Enhance”, helps you correct erroneous data, create a consistent structure and generally enhance the quality of the identity repositories.
  • The last phase, dubbed “Control”, helps you create the necessary policies in the Novell Identity Manager product to keep the data consistent and clean.

This looks like a great tool and I can’t wait to lay my hands on this little gem! Up until now, data control and cleansing has been a one-time job, mostly implemented using a battery of quick, small scripts.

Today, Novell has shown us that data cleansing and control is not a one-time step but a continuous process that deserves a front row seat in your I&AM architecture.

Online Reputation

I just learned about Opinity, an on-line personal reputation services company. From what I could learn, without joining, it seems like a site where you can aggregate your profiles and, more importantly, reputation you have created and cultivated on other sites. I am not entirely sure how it works. Anyone who knows more? I am very reluctant to join and give them personal information before I know more. Just like they want me to build up a reputation, I will wait until they have a reputation strong enough to convince me.

On the bright side, they have this big sign in the upper right corner indicating they support Infocards and OpenID!

I have been on the inactive side of the blogging spectrum lately. I do have two draft posts in the queue, however: one about digital identities (again) and one about how garbage can ruin your day.

Do you really think you are anonymous?

There is some debate going on in the identity community about anonymity. See here, here, here and here. Today I came across this post from Eric Norlin, which I found very enlightening. More specifically, this paragraph really got my attention:

Every transaction in the real-world involves not only explicit identification (ATM cards, credit cards, driver’s licenses, or the proxy of cash), but also implicit identification. By implicit identification, I mean the subtle body language and sociological clues that all persons engaged in transactions use (both consciously and subconsciously.) There is not a waitress or convenience store clerk on the planet that will not begin “identifying” the ability of a customer to live up the implicit social contract of commerce based upon their attributes (appearance, cleanliness, socially accepted standards of behavior, etc). This is not the real-world as we’d like it to be. This is the real-world as it is.

At first, I believed you could easily be anonymous in the real world. Imagine walking across town: you never have to identify yourself. Isn’t that a perfect example of anonymity? Turns out it isn’t! Even when you do not identify yourself (using some kind of ID card, for instance), people can see you and remember you. Next time you walk by, they might even recognize you. You are not anonymous anymore. They might not have much information about you, but they will still be able to identify you as “that guy that passes by around noon each Friday”. As long as you cannot prevent one encounter on the street from being linked or correlated with a previous encounter, you are not anonymous.

Eric asks the right question:

All of that nasty, real-world talk aside — the question now becomes: Should the online world reflect the real-world, or not?

My first answer would be: no. In some cases (actually, in a lot of cases) I would prefer a level of anonymity that is stronger than what I would normally get in the real world. I believe we can achieve this with the right technology. But keep in mind that it will not be easy, as explained by Ben Laurie:

That’s why you need to have anonymity as your bottom layer, on which you build whatever level of privacy you can sustain; remember that until physical onion routing becomes commonplace you give the game away as soon as you order physical goods online, and there are many other ways to make yourself linkable.

Thanks to Infocard and similar technologies, we can achieve some level of anonymity, but as soon as we have to enter our home address to get the physical goods, all anonymity is lost.

Anonymity and privacy are interesting subjects and, in my personal opinion, are part of the foundation of any Internet meta identity system.

They know what you did last summer.

Some days ago AOL, or at least a team within it, decided to release the search data of more than 650,000 users. They did replace actual user names with random numbers, but using those numbers you could still track all the search terms of a single user.

Then this announcement came: “A Face Is Exposed for AOL Searcher No. 4417749”. Using nothing but the search terms, the journalists were able to narrow the data down to a single person.

This makes us wonder how much information we are leaving behind, even anonymously, that allows others to uniquely identify us.
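To make the point concrete, here is a minimal Python sketch of why replacing user names with numbers does not stop tracking. The log rows, the user numbers and the function name are all made up for illustration; nothing here is taken from the released data.

```python
from collections import defaultdict

# Hypothetical, made-up log rows: (pseudonymous user number, search query).
# Illustrative only; not taken from any real data set.
SEARCH_LOG = [
    (1001, "best pizza in springfield"),
    (1002, "cheap flights"),
    (1001, "landscapers in springfield"),
    (1001, "homes sold on elm street"),
    (1002, "weather tomorrow"),
]


def group_by_pseudonym(log):
    """Collect every query issued under the same pseudonymous number."""
    profiles = defaultdict(list)
    for user_number, query in log:
        profiles[user_number].append(query)
    return dict(profiles)


profiles = group_by_pseudonym(SEARCH_LOG)
for user_number, queries in sorted(profiles.items()):
    print(user_number, queries)
```

The name is gone, but the number still ties all of one person's queries together, and the combined queries can say a lot about who that person is.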

Recent attempts at creating an Internet Meta Identity System (see Infocards and others) do include the possibility of identifying yourself more anonymously, for instance by proving you are over 21 without revealing who you are. However, most sites will still use tracking cookies, so over time they might be able to identify you anyway.

For some reason I am afraid it is impossible to design a system in which none of the participating parties, except the user, can accumulate enough information to uniquely identify someone. Knowing this should be part of user education. Upcoming meta identity systems will enable a smoother and more powerful experience for both end users and relying parties, but they will not completely protect your identity or privacy. You simply leave too many footprints behind to ensure that.

While writing this, I saw this Google announcement saying they will continue to store search data, despite the potential privacy concerns.

Are you human?

Most of you know these images with embedded letters and numbers, garbled just enough to make it very hard for a computer to recognize them but not too garbled for a human. These images are called captchas. For some reason I always have problems with them. They are an attempt to make me prove I am human, but in reality they want me to prove I am superhuman. Today I registered at digg.com, which uses a captcha. It took me 3 tries before I got the image right!

If there is one benefit to an identity meta-system and Cardspace, it would certainly be the demise of these captcha images.

Using Infocards, I can prove I am human, a claim that can be backed by a trusted third-party identity provider. Not only would I be able to prove that, I also wouldn’t have to come up with yet another username and password on digg.com.

I can’t wait for this identity meta-system to materialise!

Identity Silos Forever?

Lately there has been a heated discussion in the identity community about identity silos. Google’s announcement of the Google Authentication Services stirred up the fire considerably.

Ben Laurie has added a new episode in his latest blog posting, “Comparing Apples and Apples: Microsoft and Google Authentication”:

The end result of the blog deathmatch between me, Kim, Eric and Dick was a deathly silence on what I consider to be the core issue.

OK, its nice that Microsoft are developing identity management software that might not suck (but remember, it still doesn’t satisfy my Laws of Identity) but the question that’s being posed about Google applies equally to Microsoft, and, indeed, anyone else with an identity silo.

So, here’s the question: is Microsoft going to accept third party authentication for access to Microsoft properties?

I would add a question to this: if breaking down the barriers around identity silos is the primary goal, would Microsoft ever give up being an identity provider? Would Microsoft hand over passport.com to a non-profit and free organisation before turning it into an Infocard provider?

Will, with the arrival of Infocards and friends, Identity Silos disappear? Or will they remain as powerful and impenetrable as before?

Coffee-shop loyalty card, an identity?

In this blog post I explained why I didn’t think a coffee-shop loyalty card was a very good example of an identity. Pat Patterson was kind enough to comment, explaining why such a card is indeed an identity. After some thought, I believe we are talking about two different things. In other words, it comes down to the definition of a coffee-shop loyalty card.

I started with:

… a coffee-shop loyalty card is used as an example of a card-based identity. That confuses me. Assuming, from the example, that the card does not contain any personal information like name or address, how can it be seen as a card-based identity? The only connection this loyalty card has with a person (identity) is that it is carried around by one. But that would make a lot of items suddenly card-based identities. The card cannot be used to identify or authenticate a person and has only value to the person carrying it around but it is in no way connected to that person. Following this reasoning, the 10 EURO note I am carrying around also is a card-based identity.

Then Pat commented:

The coffee-shop loyalty card is an identity. The coffee-shop can build a profile of your purchasing habits over time. Sure, it’s identified as ‘74382432’ rather than ‘Joe Schmoe’, but it’s still your coffee habit.

And it’s easy for the coffee shop to link ‘74382432’ to ‘Joe Schmoe’ – they could encourage you to register your card online in return for free coffee; alternatively, they can just read your name off your credit/debit card the first time you use it to pay for coffee…

If the loyalty card points to me, either directly or indirectly, it represents an identity. However, I started from the assumption (although I admit this was not very clear from the earlier post) that the loyalty card did not contain any such information.
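The indirect link Pat describes can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the purchase records and the function name are hypothetical, and the card number and name are simply the ones from Pat's comment.

```python
# Hypothetical purchase records:
# (loyalty card number, item, name on the payment card, or None when paid in cash).
PURCHASES = [
    ("74382432", "espresso", None),
    ("74382432", "latte", "Joe Schmoe"),
    ("74382432", "espresso", None),
    ("55510000", "tea", None),
]


def link_cards_to_names(purchases):
    """Map each loyalty card number to the names seen on payment cards used with it."""
    links = {}
    for card_number, _item, payer_name in purchases:
        names = links.setdefault(card_number, set())
        if payer_name is not None:
            names.add(payer_name)
    return links


links = link_cards_to_names(PURCHASES)
print(links)
```

One card payment is enough to attach a name to every purchase made with card 74382432, while the card that was only ever used with cash still points to nobody.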

If you say that a coffee-shop loyalty card is an identity, whose identity are you referring to? In case the card is completely anonymous (no number, no name …) it is the identity of whoever is carrying the card around at that time. If I drop the card on the floor then the person who finds it will get the benefits associated with the card.

So, depending on what information the card contains, it can point to nobody (or should I say everyone) or to a very specific person. I personally wouldn’t call the card an identity when it is pointing to nobody.

Is there an identity silo paradox?

When reading Eric Norlin’s latest blog, I was very pleased to see that he got the point I was making on the Identity Workshop Google group:

Put simply the identity silo paradox is this: The largest sites on the internet have built silos (some ever-deepening) of identity information. Simultaneously, the “identirati” have been working on standards and methods that are based on the premise of opening up those silos, yet (paradox coming) the large sites currently have no valid business reason for doing so. Why would eBay open up their reputation system? Why would Google allow you to use a Yahoo! credential to login to their systems?

Today we have identity silos that we think are not interoperable because the technology to glue them together is missing. That is one of the reasons why the “identirati” have embarked on a quest to create standards and methods targeted at opening these silos. But is a lack of the right technology really the only reason?

Even if Google, Microsoft and Yahoo! used the same technology, I doubt they would ever enable interoperability. Each of them is not only a provider of identities but also a provider of services, and they make their profit on the services, not the identities. Keeping a tight grip on the identities enables them to take a holistic approach to branding their ecosystem. Think about it: Microsoft Cardspace will extend the metaphor of a “card” to a real graphical representation of it. Do you think Yahoo! will not take the opportunity to get Yahoo! branding all over their cards? If a user logged in to a Google service using a Yahoo! card, it would almost feel to the user as if they were using a Yahoo! service!

Will this change over time? Probably. As Eric points out, the forces on the web will eventually lead to more interoperability and not only on the technology front. But today, identity silos have no business reason to break down the walls and accept identities from elsewhere.

Ben Laurie from Google (and of Apache-SSL fame) said the following:

Where does Microsoft’s work on Infocard or Live ID or whatever-the-passport-nom-de-jour is show that Microsoft has any intention whatsoever of opening their silo? What it shows is that they think everyone else should open their silo.

To me, Ben is right on target with that remark. So it seems like we are heading towards the same identity silos but walled for different reasons.

On the bright side, at least on Vista, users will finally have a consistent and secure experience when dealing with identities thanks to Infocards and more specifically Microsoft’s Cardspace. That alone is worth the effort.

Address-based identities versus Card-based identities

I was reading this blog entry about address-based identities versus card-based identities. I am still thinking about the contents and will post some more thoughts about that blog in the next few days. There was however one example in the blog I would like to comment on right away:

Whatsmore, both address-based identity and card-based identity can be further classified in some very helpful ways:

  • Address-based identities can be broken into resolvable and non-resolvable. While an address-based identity is always unique in the address space in which it is assigned, that doesn’t necessarily mean it can be resolved, i.e., dereferenced via a mechanism or protocol that provides further discover or communications with a digital subject. An email address is a good example of the former; a browser cookie a good example of the latter.
  • Card-based identities can be broken into addressable and non-addressable. This means that some card-based identities may contain an address-based identity and some may not. A business card is the classic example of an addressible card-based identity; in fact the primary purpose of most business cards is to share address-based identities. On the other hand a coffee-shop loyalty card is a good example of a non-addressable card-based identity: while it describes identity-related attributes of its owner (how many cups of coffee they have purchased), it may not contain any address-based identity whatsoever (not even your real-world name).

In the second bullet a coffee-shop loyalty card is used as an example of a card-based identity. That confuses me. Assuming, from the example, that the card does not contain any personal information like name or address, how can it be seen as a card-based identity? The only connection this loyalty card has with a person (identity) is that it is carried around by one. But that would make a lot of items suddenly card-based identities. The card cannot be used to identify or authenticate a person and has only value to the person carrying it around but it is in no way connected to that person. Following this reasoning, the 10 EURO note I am carrying around also is a card-based identity.
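To make my confusion concrete, here is a minimal Python sketch of the classification as I read it. The class and field names are my own invention for illustration and do not come from the quoted post; the email address is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AddressIdentity:
    """An address-based identity; resolvable means it can be dereferenced."""
    address: str
    resolvable: bool


@dataclass
class CardIdentity:
    """A card-based identity: some attributes plus an optional address."""
    attributes: dict = field(default_factory=dict)
    address: Optional[AddressIdentity] = None

    @property
    def addressable(self) -> bool:
        # A card is addressable only if it carries an address-based identity.
        return self.address is not None


# Hypothetical examples following the quoted bullets.
email = AddressIdentity("someone@example.com", resolvable=True)
cookie = AddressIdentity("session-cookie-value", resolvable=False)
business_card = CardIdentity({"title": "Consultant"}, address=email)
loyalty_card = CardIdentity({"cups_purchased": 7})

# My objection: nothing in this model tells a 10 EURO note apart from the loyalty card.
ten_euro_note = CardIdentity({"value_eur": 10})

print(business_card.addressable, loyalty_card.addressable, ten_euro_note.addressable)
```

In this sketch the loyalty card and the banknote have exactly the same shape: a few attributes and no address. That is why I hesitate to call either of them an identity.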