High level makes you fall deep

There are two ways to omit details in a model, to introduce abstraction:

  1. Omitting details you know exist but don’t need now (abstraction on purpose)
  2. Omitting details you don’t know that exist (abstraction by accident)

As you can see from the phrasing used, the second type is a dangerous kind of abstraction. If you don’t know the details you left out, you don’t know if you got the abstraction right or not.

You’ll often be in a situation where you just don’t know enough (yet) about the system to be sure whether you got the abstraction right. In such cases, you are carrying an unknown variable with you in your design process.

That variable should eventually become (sufficiently) known. Once it does, you can revisit the abstractions you made earlier, verify whether they were justified, and make adjustments where required.

In any case, abstracting details away when you are not sufficiently aware of what you are abstracting away is a dangerous road to take. It’s the architectural equivalent of proving that 1 = 0 using some sneaky (but wrong) trick.

I often say that “high level makes you fall deep”, meaning that if you make high-level diagrams (omitting a lot of details) without being aware of the abstractions you are making, you are exposing yourself to considerable risk later on in the design and realisation phases.

Precise or flexible information models? Do we need to find a balance?

Recently I heard someone make the statement that, in describing a system using models, you need to find the proper balance between precise models and flexible models.

In this context, precise models would be models that are correct and as complete as possible, while flexible models would be models that are less correct but are capable of conveying information to stakeholders (mostly business people).

I would dare to state that the above is not correct.

The chapter on viewpoints in the ArchiMate specification makes a distinction between three types of viewpoints (models):

Deciding Viewpoint. Models in this category are meant for people who need to make decisions and decide on trade-offs. They typically compare several alternatives based on a list of criteria.

Informing Viewpoint. Models in this category are meant for people who need to get or give information on the system. They must be adapted to whatever area of interest and level of interest the audience has.

Designing Viewpoint. Models in this category are meant for people who have to further design and build the system. These people need completeness and precision.

So back to our initial question: do we need to find a balance between precision and flexibility in models describing a system? I say: no. You need precise models and you need flexible models. It depends on what they are used for (deciding, informing, designing) and who they are intended for.

If you insist on making only one set of models, for all stakeholders and for all purposes, there is no end to the amount of damage you can do. If you find yourself deciding on ‘the right balance’ in a model, take a step back and see if you are not trying to model the system using a one-size-fits-all model.

Conceptual – Logical – Physical

One would assume that a speaker at a Data Management Conference would get the distinction between conceptual, logical and physical right? Well … no :)

It’s very common to define these terms based on the level of detail they contain. There are a few things wrong with this definition:

First, it completely ignores what we know about idealization, the notion that actually underpins terms like conceptual, logical and physical. People familiar with the Zachman framework will recognize this.

Secondly, it’s a very ambiguous definition. If I add a small detail to a conceptual model, did it just become logical? If not, what if I add another small detail?

These are the definitions I use, based on idealization:

Conceptual Model. A model describing a system in which any implementation with an information system is completely abstracted away.

Logical Model. A model describing a system where we don’t abstract away a realisation with an information system, meaning we have to make design choices, but we still abstract away the technology used.

Physical Model. A model where neither the fact that it is implemented using an information system nor the technology used is abstracted away.

You can still have ‘very detailed’ or ‘high level’ models if you want, since that is orthogonal to the above. You can have a high-level physical model or a very detailed conceptual model.
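
To make the distinction a bit more tangible, here is a small, hypothetical illustration in Python (the ‘Customer’ example and all names are mine, not taken from any particular method). Note that the conceptual level can only appear as prose or a diagram, because by definition it abstracts away any information system; as soon as we write structures or DDL we are already at the logical or physical level:

    # Conceptual: no information system is assumed at all, so this level lives
    # in language or diagrams rather than code. For example:
    #   "A Customer places Orders; every Order belongs to exactly one Customer."

    # Logical: we have decided an information system will realise this, so we
    # make design choices (entities, attributes, identifiers), but we stay
    # technology-neutral. A hypothetical sketch:
    from dataclasses import dataclass

    @dataclass
    class Customer:
        customer_id: int      # chosen identifier -- a design choice
        name: str

    @dataclass
    class Order:
        order_id: int
        customer_id: int      # reference to the owning Customer

    # Physical: a concrete technology is chosen and no longer abstracted away,
    # e.g. the same structure expressed as SQLite DDL:
    PHYSICAL_DDL = """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE "order"  (order_id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(customer_id));
    """

    # Usage at the logical level -- still no technology chosen:
    print(Customer(customer_id=1, name="Acme NV"))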

Snowflakes are hard to replace

When you want to hire someone, you are looking for some particular skills. I call these the primary skills. Any person you hire, any person in your team, also comes with a set of secondary skills. Secondary skills are skills you weren’t specifically looking for but you got them anyway.

If you are looking for a Java developer, you are looking for someone with ‘Java development’ as a primary skill. The person you hire, might also have experience in C# development, project management or Linux system administration. Those are the secondary skills you got even though you didn’t ask for them.

In my experience, it’s dangerous to structurally use those secondary skills in your team. Sure, when an immediate fire has to be put out, being able to call on these secondary skills might be a life saver, but if you start depending on them structurally, you are putting yourself at serious risk.

What is the risk? We have those skills, it would be a shame not to use them!

Well, let’s assume you have been using those secondary skills on a structural basis: one of your Java developers is not only developing code but also has been doing some of the project management and is administering the local build and version control Linux server.

One day, that Java developer comes to your office and says he is leaving for a new job. Or perhaps something less permanent: you get a call telling you that your Java developer has had an accident and won’t be able to come to work for 4 to 6 weeks.

That is when you start having an issue. You find yourself on the market to find a (temporary) replacement that can develop in Java, has project management skills and knows how to administer a build and version control Linux server. You are looking for a very specific snowflake. Good luck finding that!

That is the risk of structurally using the secondary skills. You are creating unique snowflakes in your team that are very hard to replace. The only way to avoid this risk is not to structurally use the secondary skills available in your team. That might sound counterintuitive to some people: not using skills that are present in your team while you need them. Yet it is the safest way to avoid having to replace a snowflake 🙂

Buzzword Bingo and You Are Not So Smart

We all know the idea of Buzzword Bingo. You create a matrix filled with so-called buzzwords and when, during a meeting or speech, you hear one of the words, you tick it off. When you reach a certain number of ticked words or have filled a row or column, you call out “Bingo” or “Bullshit”.

When you look at common buzzword bingo matrices, the words actually seem pretty normal. So what is so “bullshit” about them?

When a person uses a word, it can be in one of two circumstances:

  1. The person is familiar with the subject
  2. The person is not (sufficiently) familiar with the subject

Case 2 is an example of Cargo Cult, where someone mimics, without understanding why, the behaviour (the use of certain words) of people they see as successful, hoping to achieve some of that success as well.

The listener can be in the same two situations as the speaker. Since the proper or improper use of a word is in the eyes of the beholder, or better said, the ears of the listeners, we end up with the following four situations:

  1. The person is familiar with the subject and finds the word is used properly
  2. The person is familiar with the subject and finds the word is used improperly
  3. The person is not (sufficiently) familiar with the subject and considers the word improperly used
  4. The person is not (sufficiently) familiar with the subject and considers the word properly used

If we put these two lists in a matrix, we get 8 different combinations.

Buzzword Bingo

Of these 8, there are four combinations that could spark a Buzzword Bingo: combinations #3, #4, #7 and #8. But only one of them is a genuine Buzzword Bingo where the listener does not embarrass him- or herself: combination #4.

In combination #3, the result should be a constructive debate on the use of the word: both people are familiar with the subject and could experience a valuable learning opportunity. In combination #7, the listener would only embarrass him- or herself. Finally, combination #8 is the saddest case: both parties are unfamiliar with the subject and both embarrass themselves.

It should already be clear that playing Buzzword Bingo is a dangerous game, one in which you are more likely to ridicule yourself than anything else.

Sadly, it is even worse. As we learn from David McRaney in his book You Are Not So Smart, people have a strong tendency to think they are smarter than average. They think they are in combinations #3 or #4 when in reality they are in combinations #7 or #8. That’s why people love to play Buzzword Bingo: they honestly think they know better, they honestly think they are right and the speaker is clueless.

Reality is sobering: you are as stupid, or as smart, as the person you are listening to. So if he is using a term you think is a useless buzzword, it is probably as much you who doesn’t understand the word as it is him performing a ritual in his cargo cult.

Recap of European Identity & Cloud Conference 2013

The 2013 edition of the European Identity & Cloud Conference just finished. As always KuppingerCole Analysts has created a great industry conference and I am glad I was part of it this year. To relive the conference you can search for the tag #EIC13 on Twitter.

KuppingerCole manages each time to get all the Identity thought leaders together which makes the conference so valuable. You know you’ll be participating in some of the best conversations on Identity and Cloud related topics when people like Dave Kearns, Doc Searls, Paul Madsen, Kim Cameron, Craig Burton … are present. It’s a clear sign that KuppingerCole has grown into the international source for Identity related topics if you know that some of these thought leaders are employed by KuppingerCole themselves.

Throughout the conference a few topics kept popping up making them the ‘hot topics’ of 2013. These topics represent what you should keep in mind when dealing with Identity in the coming years:

XACML and SAML are ‘too complicated’

It seems that after the announced death of XACML, everyone felt liberated and dared to speak up. Many people find XACML too complicated. Soon SAML joined the club of ‘too complicated’. The source of the complexity was identified as XML, SOAP and satellite standards like WS-Security.

There is a reason protocols like OAuth, which stay far away from XML and family, have so rapidly gained so many followers. REST and JSON have become a ‘sine qua non’ for Internet standards.

There is an ongoing effort for a REST/JSON profile for XACML. It’s not finished, let alone adopted, so we will have to wait and see what comes of it.

That reminds me of a quote from Craig Burton during the conference:

Once a developer is bitten by the bug of simplicity, it’s hard to stop him.

It sheds some light on the (huge) success of OAuth and other Web 2.0 APIs. It also looks like a developer cannot be easily bitten by the bug of complexity. Developers must see serious rewards before they are willing to jump into complexity.

OAuth 2.0 has become the de-facto standard

Everyone declared OAuth 2.0, and its cousin OpenID Connect, to be the de facto Internet standard for federated authentication.

Why? Because it’s simple: even a mediocre developer who hasn’t seen anything but bad PHP is capable of using it. Try to achieve that with SAML. Of course, that doesn’t mean it’s without problems. OAuth uses bearer tokens that are not well understood by everyone, which leads to some frequently seen security issues in the use of OAuth. On the other hand, given the complexity of SAML, do we really think everyone would use it as it should be used, avoiding security issues? Yes, indeed …
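
To illustrate why developers find OAuth 2.0 so approachable: calling a protected resource is little more than one HTTP header. This is only a minimal sketch; the endpoint and token below are made up, and the widely used third-party requests library is assumed to be installed. The flip side is visible in the same few lines: whoever presents the token gets access, so everything hinges on TLS and short token lifetimes.

    import requests  # third-party HTTP library, assumed installed

    ACCESS_TOKEN = "2YotnFZFEjr1zCsicMWpAA"    # hypothetical token obtained earlier

    # A bearer token grants access to whoever presents it, so it must only
    # ever travel over HTTPS and should be short-lived.
    response = requests.get(
        "https://api.example.com/v1/me",       # hypothetical resource server
        headers={"Authorization": "Bearer " + ACCESS_TOKEN},
        timeout=10,
    )
    print(response.status_code, response.text)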

API Economy

There was a lot of talk about the ‘API Economy’. There are literally thousands and thousands of publicly available APIs (called “Open APIs”) and orders of magnitude more hidden APIs (called “Dark APIs”) on the web. It has become so big and pervasive that it forms an ecosystem of its own.

New products and cloud services are being created around this phenomenon. It’s not just about exposing a REST/JSON interface to your data. You need a whole infrastructure: throttling services, authentication, authorization, perhaps even an app store.

It’s also clear that developers once more become an important group. There is no use in an Open API if nobody can, or is willing to, use it. Companies that depend on the use of their Open API suddenly see a whole new type of customer: developers. Having a good Developer API Portal is a key success factor.

Context for AuthN and AuthZ

Many keynotes and presentations referred to the need for authn and authz to become ‘contextual’. It was not entirely clear what was meant by that; nobody could give a clear picture. No one had any idea what kind of technology or new standards it would require. But everyone agreed this was where we should be going 😉

Obviously, the more information we can take into account when performing authn or authz, the better the result will be. Authz decisions that take the present and the past into account, and not just whatever is directly related to the request, can produce a much more precise answer. In theory, that is …

The problem is that computers are notoriously bad at anything that is not rule-based. Once you move up the chain and start including the context, then the past (heuristics), and finally principles, computers give up pretty fast.

Of course, nothing keeps you from defining more rules that take contextual factors into account. But I would hardly call that ‘contextual’ authz. That’s just plain RuBAC with more PIPs available. It only becomes interesting if the authz engine is smart in itself and can decide, without hard wiring the logic in rules, which elements of the context are relevant and which aren’t. But as I said, computers are absolutely not good at that. They’ll look at us in despair and beg for rules, rules they can easily execute, millions at a time if needed.
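
To make that point concrete, here is a minimal sketch (all names and rules invented) of what ‘RuBAC with more PIPs’ looks like in practice: the ‘context’ is just extra attributes fetched from Policy Information Points, and the decision is still an ordinary hand-written rule.

    # Hypothetical Policy Information Points: they only *supply* attributes.
    def pip_geolocation(user_id):
        return "BE"          # pretend lookup of the user's current location

    def pip_recent_failures(user_id):
        return 1             # pretend lookup of failed logins today

    # The 'contextual' decision is still a plain, hand-written rule (RuBAC).
    def authorize(user_id, action, resource):
        location = pip_geolocation(user_id)
        failures = pip_recent_failures(user_id)
        if action == "read" and location in {"BE", "NL"} and failures < 3:
            return "Permit"
        return "Deny"

    print(authorize("alice", "read", "invoice-42"))   # Permit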

On the last day there was a presentation on RiskBAC, or Risk-Based Access Control. It is situated in the same domain as contextual authz. It’s something that would solve a lot, but I would be surprised to see it anytime soon.

Don’t forget: the first thing computers do with anything we throw at them is turn it into numbers. Numbers they can add and compare. So risks will be turned into numbers using rules we gave to the computer, and we all know what happens if we, humans, forget to include a rule.
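
A toy example of what that looks like (rules and weights are invented): the ‘risk’ is whatever the rules we happened to write add up to, and a situation nobody wrote a rule for simply scores zero.

    # Hypothetical risk rules: each maps an observation to a number.
    RISK_RULES = {
        "new_device": lambda ctx: 30 if ctx.get("new_device") else 0,
        "foreign_ip": lambda ctx: 25 if ctx.get("country") != "BE" else 0,
        "odd_hour":   lambda ctx: 15 if ctx.get("hour", 12) < 6 else 0,
    }

    def risk_score(ctx):
        return sum(rule(ctx) for rule in RISK_RULES.values())

    # A stolen session replayed from the victim's own machine at noon
    # triggers none of the rules above: score 0, so it looks 'low risk'.
    print(risk_score({"new_device": False, "country": "BE", "hour": 12}))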

Graph Stores for identities

People got all excited by Graph Stores for identity management. Spurred by the interest in NoSQL and Windows Azure Active Directory Graph, people saw it as a much better way to store identities.

I can only applaud the renewed focus on relations when dealing with identity. It’s what I have been saying for almost 10 years now: identities are the manifestations of a relationship between two parties. I had some interesting conversations with people at the conference about this and it gave me some new ideas. I plan to pour some of those into a couple of blog articles. Keep an eye on this site.
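
One way to picture ‘identities as manifestations of a relationship’ in a graph store; this is a sketch with invented names and plain Python data structures, not any particular product’s API. Parties are nodes, relationships are edges, and the identity data hangs off the edge rather than off the person.

    # Nodes: the two parties. Edges: the relationship between them.
    parties = {"alice", "acme_webshop"}

    relationships = [
        {
            "from": "alice",
            "to": "acme_webshop",
            "type": "customer_of",
            # The identity is what this relationship manifests as:
            "identity": {"username": "alice78", "loyalty_id": "AC-1029"},
        },
    ]

    # The identity only exists in the context of the relationship;
    # remove the edge and there is nothing left to store.
    for rel in relationships:
        print(rel["from"], rel["type"], rel["to"], "->", rel["identity"])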

The graph stores themselves are a rather new topic for me so I can’t give more details or opinions. I suggest you hop over to that Windows Azure URL and give it a read. Don’t forget that ForgeRock  already had a REST/JSON API on top of their directory and IDM components.

Life Management Platforms

Finally, there was an entire separate track on Life Management Platforms. It took me a while to understand what it was all about. Once I found out it was related to the VRM project of Doc Searls, it became clearer.

Since this recap is almost getting longer than the actual conference, I’ll hand the stage to Martin Kuppinger and let him explain Life Management Platforms.

That was the 2013 edition of the European Identity & Cloud Conference for me. It was a great time and even though I haven’t even gotten home yet, I already intend to be there again next year.

Conceptual, Logical and Physical

In his article “ArchiMate from a data modelling perspective”, Bas van Gils from BiZZdesign talks about the difference between conceptual, logical and physical levels of abstraction. This distinction is very often used in (enterprise) IT architecture but is also often poorly understood, defined or applied.

Bas refers to the TOGAF/IAF definitions:

TOGAF seems to follow the interpretation close to Capgemini’s IAF where conceptual is about “what”, logical is about “how” and physical is about “with what”. In that case, conceptual/logical appears to map on the architecture level, whereas physical seems to map on the design/ implementation level. All three are somewhat in line but in practice we still see people mix-and-match between abstraction levels.

I am not a fan of the above. It is one of those definitions that tries to explain a concept by using specific words in the hope of evoking a shared emotion. Needless to say, this type of definition is at the heart of many open-ended and often very emotional online discussions.

Conceptual, logical and physical are most often related to the idealization–realization spectrum of abstraction. This spectrum abstracts ‘things’ by removing elements relating to the realization of the ‘thing’. In the opposite direction, the spectrum elaborates ‘things’ by adding elements related to a specific realization. You can say that a conceptual model contains fewer elements related to a realization than a logical model, and that a physical model contains more elements related to a realization than a logical model.

In other words, conceptual, logical and physical are relative to each other. They don’t point to a specific abstraction. For that you need to specify more information on exactly what kind of elements of realizations you want to abstract away at each level of abstraction.

The most commonly used reference model for using these three levels is as follows:

  • Conceptual. All elements related to an implementation with an Information System are abstracted away.
  • Logical. A realization with an Information System is not abstracted away anymore. All elements related to a technical implementation of this Information System are abstracted away.
  • Physical. A technical realization is assumed and not abstracted away anymore.

That is the only way to define the levels conceptual, logical and physical: define what type of realization-related elements are abstracted away at each level. You can never assume everyone uses the same reference model. You either pick an existing one (e.g. Zachman Framework) or define your own.

Saying that conceptual is “what”, logical is “how” and physical is “with what” is confusing to say the least. Especially if you know that in the Zachman Framework “how” and “what” are even orthogonal to “conceptual” and “logical”.

At first it is not easy to define a conceptual model without referring to an Information System. For instance, any reference to lists, reports or querying assumes an Information System and is in fact already at the logical level.

A misunderstanding I often hear is that people assume that conceptual means (a lot) less detail than logical. That’s not true. A conceptual model can consist of as many diagrams and pages of text as a logical model. In practice, conceptual models are often more limited, but I only have to point to the many IT projects that failed due to too little detail at the conceptual level. The assumption is just wrong.

Smart Meters … but not so secure

In this article Martin Kuppinger from KuppingerCole Analysts discusses a security leak in a device used for controlling heating systems.

It’s shocking but I am not surprised. IT history is riddled with cases of devices, protocols and standards that required solid security but failed. Mostly they failed because people thought they didn’t need experts to build in security. Probably the most common failure in IT security: thinking you don’t need experts.

Who remembers WEP or even S/MIME, PKCS#7, MOSS, PEM, PGP and even XML?

The last link shows how simple sign & encrypt is not a fail-safe solution:

Simple Sign & Encrypt, by itself, is not very secure. Cryptographers know this well, but application programmers and standards authors still tend to put too much trust in simple Sign-and-Encrypt.

The moral of the story is: unless you really are an IT security expert, never ever design security solutions yourself. Stick to well-known solutions, preferably in tested and proven libraries or products. Even then, I strongly encourage you to consult an expert; it’s just too easy to naively apply an otherwise good solution in the wrong way.

Better Architecture For Electronic Invoices

Introduction

As an independent consultant I operate from my own one-man company. That means I am participating in the world of invoicing. In the last couple of years I have seen a steady rise in electronic invoices. Before electronic invoices I received all my invoices in my letter box and believe me, compared to electronic invoices, that was easy and cheap. In short, I don’t like electronic invoices.

The most important downside is that they don’t arrive in my letter box. Most providers of electronic invoices offer me a portal where I can log in and download the invoice. Of course, every single one of these portals has its own registration system, its own credentials and its own password policies. Needless to say, it’s very cumbersome to fetch invoices. Most of the time I end up ‘logging in’ through the ‘password forgotten’ use case. I don’t even bother to change the generated password since next time I’ll be logging in through the ‘password forgotten’ use case anyway.

After clearing the first hurdle, logging in to the website, I am greeted by the second one: user interface and functionality. Some look nice, some look like a website that would have been great in 1994 but not so anymore in 2013. Tastes differ so I’ll assume it’s me. But in terms of functionality, some of these portals seem to make it a sport to provide the worst possible user experience in finding and downloading invoices.

They also differ greatly in what kind of functionality they offer. There are portals that offer me an online archive (although there is no mention whatsoever of any long-term guarantees about the archive’s existence). Others encourage me to download the invoice as soon as possible since they only offer access for a limited amount of time. Hint: since not all of them offer archive functionality and since I can’t afford to have an archive of invoices scattered all over the internet (with varying service level agreements), I can’t make use of those archives anyway. Whatever functionality they offer, in my world they are just temporary inboxes: I download the invoices and archive them locally as soon as I can, after which I forget about the copy on the portal.

All sources of electronic invoices notify me by email when a new invoice is available on the portal. One source is even so kind as to include a direct download link in the email. Convenient as it may seem (and it really is), it’s also a security risk: the direct download link does not require any form of authentication. It acts like a ‘bearer token’ with a long life span, a token that is transmitted in clear text as part of the email and can be intercepted.
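
For contrast, a common way to keep that convenience without handing out a long-lived, unauthenticated secret is a short-lived, signed link. The sketch below is purely illustrative: the URL layout, key and invoice identifier are invented, and the server would recompute and verify the signature and expiry before serving anything.

    import hashlib, hmac, time

    SECRET_KEY = b"server-side-secret"            # known only to the portal

    def signed_download_url(invoice_id, valid_seconds=86400):
        expires = int(time.time()) + valid_seconds
        payload = f"{invoice_id}:{expires}".encode()
        signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
        # Hypothetical URL layout; the link stops working once 'expires' has passed.
        return (f"https://invoices.example.com/download"
                f"?invoice={invoice_id}&expires={expires}&sig={signature}")

    print(signed_download_url("2013-0042"))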

Current Architecture

All this is the consequence of a one-sided view of the problem by publishers of electronic invoices. If you look at the challenge of issuing invoices from the point of view of the company, the architecture below is an obvious solution: invoices are generated internally in a digital format, they are distributed through a portal, and the company can even track which invoices have been accessed and which haven’t.

Electronic Invoices - 1

If you take the point of view of a customer and assume the same architecture as above, the picture looks different and a lot less shiny. We see that a customer is faced with dozens of point-to-point access paths to invoices, all of which are implemented differently and act differently.

Electronic Invoices - 2

Better Architecture?

As an architect I feel this can be done better, a lot better in fact. In my proposal for an architecture, there are four roles involved in getting electronic invoices from a publisher to a customer: (1) Customer, (2) Collector, (3) Distributor and (4) Publisher.

Electronic Invoices - 3

The primary function of a Collector is to collect all invoices for specific customers and make them available to those customers. The role of the Distributor is to collect invoices from Publishers and route them to the correct Collector so they end up with the correct customer. The main goal of the Collector and the Distributor is to cater to their respective customers: Customers and Publishers. That is a significant difference from the current architecture: both stakeholders (Customers and Publishers) have a party who will act in their best interest, as catering to them is that party’s reason for existence. In the current architecture, no party exists that will do its best to cater to the Customer. Of course, Publishers may say they do when they offer their portals, but the reality is that it’s not their primary reason for existence, and the problems mentioned above prove that they only go so far in providing services for the customer.
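
A toy model of the proposed flow: the role names (Collector, Distributor, Publisher, Customer) come from the architecture above, but everything else, including the matching on customer identifiers, is invented purely for illustration.

    class Collector:
        """Acts on behalf of Customers: one inbox per customer."""
        def __init__(self):
            self.inboxes = {}                    # customer_id -> list of invoices
        def deliver(self, customer_id, invoice):
            self.inboxes.setdefault(customer_id, []).append(invoice)

    class Distributor:
        """Acts on behalf of Publishers: routes each invoice to the right Collector."""
        def __init__(self):
            self.routes = {}                     # customer_id -> Collector
        def register(self, customer_id, collector):
            self.routes[customer_id] = collector
        def publish(self, customer_id, invoice):
            self.routes[customer_id].deliver(customer_id, invoice)

    # A Publisher hands invoices to the Distributor and no longer needs its own portal.
    my_collector = Collector()
    distributor = Distributor()
    distributor.register("customer-123", my_collector)
    distributor.publish("customer-123", {"publisher": "TelcoCo", "amount": 42.50})
    print(my_collector.inboxes["customer-123"])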

In my specific case, I could opt for a Collector service that prints and mails the invoices to me while at the same time keeping an electronic archive for me.

In Belgium you have a few services that seem to be offering services related to the Collector role: ZoomIt and Unified Post. If you look at their websites and service offerings it becomes obvious their primary focus is on the publisher of invoices and not the customer.

Outsourcing Architecture?

The Harvard Business Review Blog Network published a rather interesting article on the reasons behind the failure of the Boeing 787 Dreamliner. One of the main causes seems to be related to how Boeing outsourced work.

Boeing undertook one of the most extensive outsourcing campaigns that it has ever attempted in its history. That decision has received a lot of press coverage, and the common wisdom is coalescing around this as a cause of the problems.

Outsourcing as such is not wrong or risky. Many success stories heavily depend on outsourcing: Amazon outsources delivery, Apple outsources all manufacturing … The key is what you outsource and what you keep internal.

Rather, the issues the plane has been facing have much more to do with Boeing’s decision to treat the design and production of such a radically new and different aircraft as a modular system so early in its development.

In other words, Boeing outsourced some of the architecture of the new plane. That was new for them. Until then, they had only outsourced the manufacturing of parts after designing them internally.

if you’re trying to modularize something — particularly if you’re trying to do it across organizational boundaries — you want to be absolutely sure that you know how all the pieces optimally work together, so everyone can just focus on their piece of the puzzle. If you’ve done it too soon and tried to modularize parts of an unsolved puzzle across suppliers, then each time one of those unanticipated problems or interdependencies arises, you have to cross corporate boundaries to make the necessary changes — changes which could dramatically impact the P&L of a supplier.

This equally applies to IT solutions. If you outsource parts of the solution before you have designed the whole, you’ll end up with problems whose solutions cross supplier boundaries, impact P&L of those suppliers and require contract negotiations. To avoid this, the entire solution has to be designed before suppliers are chosen and contracts signed. That includes the design of new components, changes to existing components and, often forgotten, integration with existing (“AS-IS”) components.

Either you outsource this design entirely (all of it; you can’t be cheap and outsource only 95% of it) or you design the whole internally first.

In the creation of any truly new product or product category, it is almost invariably a big advantage to start out as integrated as possible. Why? Well, put simply, the more elements of the design that are under your control, the more effectively you’re able to radically change the design of a product — you give your engineers more degrees of freedom. Similarly, being integrated means you don’t have to understand what all the interdependencies are going to be between the components in a product that you haven’t created yet (which, obviously, is pretty hard to do). And, as a result of that, you don’t need to ask suppliers to contract over interconnects that haven’t been created yet, either. Instead, you can put employees together of different disciplines and tell them to solve the problems together. Many of the problems they will encounter would not have been possible to anticipate; but that’s ok, because they’re not under contract to build a component — they’ve been employed to solve a problem. Their primary focus is on what the optimal solution is, and if that means changing multiple elements of the design, then they’re not fighting a whole set of organizational incentives that discourage them from doing it.

For Boeing it is going to be a very costly lesson. But at least they’ll have a chance to learn. Large IT projects often fail for exactly this reason: modularizing a complicated problem too soon.

One last element from the article … why did Boeing do this?

They didn’t want to pay full price for the Dreamliner’s development, so, they didn’t — or at least, that’s what they thought. But as Henry Ford warned almost a century earlier: if you need a machine and don’t buy it, then you will ultimately find that you have paid for it and don’t have it.

They wanted to save on the development of the aircraft and thought that by outsourcing the design (the ‘tough problems’) they would keep costs low.

The moral of this: don’t outsource pieces of a puzzle before the entire puzzle is known (designed or architected).

How many times did you encounter this in IT projects?