Teams

Abstract

This is a review of Team Topologies: Organizing Business and Technology Teams for Fast Flow, a book by Manuel Pais and Matthew Skelton that describes a relatively new approach to work organization. The authors focus on creative work, but the approach is generally applicable to other areas of human activity. The key idea is that the team (agile, dynamic, and flexible) should be the central element of any organization or group. Like the authors, I will examine the idea of team organization from the perspective of an industry close to my own: IT.

AS IS

Currently, we see a wide variety of approaches to organizing interactions between employees within an organization. Much depends on the organization itself, and a universal approach is hard to imagine. Firstly, each organization is unique, and its environmental conditions cannot be replicated. Moreover, each organization differs from itself at different times. Secondly, an organization’s historical and managerial experience also significantly influences it, since in many ways that experience has shaped the organization itself. At the same time, different companies share a number of similarities that allow them to be grouped into cohorts. I believe the most obvious and straightforward way to define these cohorts is by organizational scale and hierarchy.

Small organizations. Rapid communication facilitates high mobility and dynamism.
Large organizations. The opposite of small ones: a large number of interconnected groups of people with weak ties to each other, which can lead to communication difficulties when interaction between groups is necessary.
Medium organizations. Something between small and large.
Corporations. Several organizations (usually large or medium-sized) combined into a single entity.

According to Conway’s Law, the systems an organization designs mirror its communication structure. This certainly makes sense: communication between people determines how interactions are structured, and the result of interaction is always something complete - what we commonly call a product (or service).

Communication in organizations is often overlooked, on the assumption that all parties involved will behave sensibly. In their book Improving Performance: How to Manage the White Space on the Organization Chart, Geary A. Rummler and Alan P. Brache describe a problem characteristic of medium and large organizations: communication voids (they call them “white space”) arise between connected groups of people (be it a department, a team, or other group). These voids form quite naturally - it takes time and effort for a person to initiate communication with another person. Two million years ago, communication between disparate tribes in East Africa, competing for scarce basic resources, was not only difficult - they viewed anyone different from themselves as an enemy encroaching on their resources and therefore putting them at risk of starvation. The modern world, of course, is not the African savannah of 2 million years ago, but human nature (or that of any other organism) does not change so quickly - so modern humans are not far removed from the hunter-gatherer from the tribe near Lake Eyasi.

So, one of the key factors influencing the final product is communication. The central element of communication between people is, of course, the individual. But the central element of communication between groups of people is the group itself. And the law is simple: the larger the group, the more complex the communication (not the quality, but the complexity). Below is a formula for estimating the potential number of Communication Channels (CC) in a group of N people:

CC = \frac{N * (N - 1)}{2}
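
To make the growth concrete, here is a minimal Python sketch of the formula above. The function name `communication_channels` is my own and does not come from the book; the group sizes in the loop are only illustrative.

```python
def communication_channels(n: int) -> int:
    """Potential pairwise communication channels in a group of n people."""
    if n < 0:
        raise ValueError("group size must be non-negative")
    # N * (N - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


# Print the channel count for a few illustrative group sizes.
for size in (5, 9, 15, 50, 150):
    print(size, communication_channels(size))
```

A group of 9 has 36 potential channels, while a group of 150 has 11,175: the number of channels grows roughly with the square of the group size.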

TO BE

In the model of communication organization proposed in Team Topologies, the team is considered the central element. Of course, the term “team” is generally understood to refer to a small number of people - the optimal number is 7 to 9. Anthropologist Robin Dunbar, whose research the authors draw on, indicates that the maximum number of people a person can completely trust is no more than 15. He also set thresholds for relationships based on the degree of trust between people (of course, all values are averages for the average person):

5 - maximum closeness and trust
15 - can rely on and trust. Applicable term: tribe
50 - can trust, but attitudes are fickle
150 - the effective limit of interpersonal memory
150-1500 - the limit for effective interactions between people (other researchers)

Social Circle

There’s a grain of truth to the idea of such a primacy for teams - they are the “workhorses” for most tasks across a wide range of industries, and for tasks of any size, large or small.

Since the team is viewed as a single unit supporting all work and activities, no task should be assigned to a specific individual. Tasks should be assigned to teams! Within a team, tasks can be divided among specific members, but the entire team is always(!) responsible for completing the task. The same applies to rewards and penalties for team members - they are allocated to the entire team, without distinction based on contribution. This is perhaps one of the few controversial points in the entire team-first philosophy. The budget for various activities is also allocated to the entire team, not to individual members.

In a capitalist economy, the primary goal of any organization is to maximize profit. Any unnecessary costs must be reduced, and profits must be maximized at any cost. This approach is now called Lean Thinking (although it has always existed without such a name). Anything that doesn’t add value (not necessarily in physical terms) should be considered waste and therefore eliminated.
One component of Lean Thinking for IT, as proposed by the authors of Team Topologies, is the Thinnest Viable Platform, TVP - designing and using only those parts of a product or service that will have a real-world application. This includes equipment, technologies, roadmaps, management tools, documentation principles, and so on - the minimum set of components necessary for launch.

Teams moulding

As a rule, forming a new team as a production unit takes from two weeks to three months (or more). This estimate assumes a stable composition: constant turnover of team members significantly lengthens the period and negatively impacts overall team performance. Fred Brooks makes this point in The Mythical Man-Month, noting that the effect of adding a new team member becomes apparent only after a certain period of time (there are no clear time limits, of course, as they are highly context-dependent). Brooks emphasizes not only familiarization with technical and organizational details but also the need to establish communication links.

Stages of team formation. The Tuckman Performance Model describes the following stages:

Forming - team assembly
Storming - a period of conflict
Norming - the team’s stabilization phase
Performing - maximum team performance

Team Moulding Flow

The Forming process is the first step, and it’s not as simple as it might seem at first glance. When Forming a team, it’s important to:

avoid the ad hoc team anti-pattern. This anti-pattern consists of quickly assembling a team to solve a small problem and then disbanding it.
avoid mixing teams to achieve short-term goals.

Meanwhile, teams aren’t static entities - their composition, scope, and tasks can change depending on current needs.
One successful team formation model is Spotify’s model: the company’s entire technical(!) staff is divided into small, autonomous, cross-functional teams with clear strategic (long-term) goals. If teams’ scopes overlap, they are combined into subsets (tribes).
Another well-known example is the two-pizza-team (Amazon) - the idea is to limit the team size to the number that can be fed by two pizzas. This composition should ensure the product follows the rule: “You build it, you run it!”.

The Performing phase cannot progress linearly - it also has limits. In his study, Cognitive Load During Problem Solving: Effects on Learning, psychologist John Sweller identifies different types of cognitive load during problem solving:

intrinsic - fundamental/basic aspects of problem solving (“Which architecture should I choose?”)
extraneous - questions that extend beyond the immediate problem but are closely related (“How will the component be installed?”)
germane - specific questions that are only indirectly related to the problem solution and not immediately apparent (“How will this change impact consumers?”)

Each level shifts the perspective on the problem from a narrower to a broader one, shifting the focus from the technical problem to the business solution. To improve team success, it’s necessary to reduce cognitive load at the Intrinsic (through training) and Extraneous (through process automation) levels, and focus efforts on the Germane level - the one that will deliver value.

In addition to the methods described above, (unnecessary) cognitive load is also reduced by avoiding the Big Ball of Mud anti-pattern and limiting the team’s scope of responsibility.
During the Performing process, implementation difficulties and schedule delays may arise. In this case, expanding the team or increasing the total number of teams can be considered, but this will entail additional costs. Most practitioners believe that it’s better to rethink the work organization - reduce the intrinsic and extraneous cognitive load on overloaded teams and eliminate unnecessary communications. To achieve this, it’s necessary to track and monitor communications. No special tools are required; a notepad or Excel will suffice!
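
For those who prefer a script to a spreadsheet, the communication tracking described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the log format (one pair of team names per recorded interaction), the team names, and the function `busiest_pairs` are hypothetical and not taken from the book.

```python
from collections import Counter

# Hypothetical log: one (from_team, to_team) entry per recorded interaction.
interactions = [
    ("checkout", "payments"),
    ("payments", "checkout"),
    ("checkout", "platform"),
    ("checkout", "payments"),
    ("search", "platform"),
]


def busiest_pairs(log, top=3):
    """Count interactions per unordered team pair, most frequent first."""
    # Sorting each pair makes (a, b) and (b, a) count as the same channel.
    counts = Counter(tuple(sorted(pair)) for pair in log)
    return counts.most_common(top)


print(busiest_pairs(interactions))
```

Pairs that dominate such a tally are the first candidates for review: either the interaction is genuinely needed and should be formalized, or it signals a poorly drawn boundary between teams.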

When forming a team, we answer the following questions:

what does it take for a team to be effective?
how do we manage our part of the product?
how do we attract more new users?
how do we reduce cognitive load?
how do we use and share information with other teams?

How to assign tasks?

How to assess the acceptable Cognitive Domain limits for a team? First, it’s necessary to determine the type of Cognitive Domain complexity that will be encountered: simple - the solution is clear; complicated - the difficulties are known and solutions must be developed; complex - a significant amount of research is required. Previous experience and comparative analysis between tasks can help characterize the complexity.

Secondly, it’s necessary to estimate how many Cognitive Domains of varying complexity a single team can handle.
After assessing the Cognitive Domains themselves, they should be distributed among teams. Here’s what’s important:

one team should be responsible for each Cognitive Domain. If the task area is too large for a single team, it should be split (it’s important to maintain a balance, as excessive splitting can significantly inflate the budget).
one team can effectively work with no more than 2-3 simple Cognitive Domains.
if a team is occupied with a complex Cognitive Domain, it means it can’t effectively work with other Cognitive Domains.
avoid having two or more complicated domains in a single team.
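
The allocation rules above can be turned into a simple check. The sketch below is my own illustration, not a tool from the book: the function `allocation_warnings`, the dictionary format, and the choice of 3 as the hard limit for simple domains (the upper end of "2-3") are all assumptions.

```python
from collections import Counter

MAX_SIMPLE = 3  # assumed limit: upper end of the "2-3 simple domains" rule


def allocation_warnings(domains):
    """Return warnings for one team's domains, given as {name: complexity}."""
    counts = Counter(domains.values())
    unknown = set(counts) - {"simple", "complicated", "complex"}
    if unknown:
        raise ValueError(f"unknown complexity: {sorted(unknown)}")
    warnings = []
    if counts["simple"] > MAX_SIMPLE:
        warnings.append("more than 3 simple domains")
    if counts["complicated"] >= 2:
        warnings.append("two or more complicated domains")
    if counts["complex"] >= 1 and len(domains) > 1:
        warnings.append("a complex domain shared with other domains")
    return warnings


# Hypothetical team owning one complex and one simple domain.
print(allocation_warnings({"pricing engine": "complex", "invoices": "simple"}))
```

An empty list means the allocation does not violate any of the listed rules; it does not by itself prove that the team's load is acceptable.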

The authors believe that the architecture of a product under development should be built around Cognitive Domains (that is, around neither the business requirements for the product nor the technical implementation of the architecture). This makes sense: the more conveniently a product is divided between teams, the more efficiently (and cost-effectively) it can be implemented. However, there is one important condition: a clear understanding of the final product expected as a result is essential.
Eliminating any tasks that increase the team’s cognitive load also contributes to increased productivity (simplification of the workflow, management practices, limiting external communications, etc.).

Team Members

Every team member must put the team’s interests above their own. In return, the team relieves the individual performer of personal responsibility. Therefore, for successful team collaboration, each member is expected to:

participate in meetings
participate in decision-making and determining the direction of movement (for the entire project or a specific area)
maintain interest in the team’s goals
help team members if difficulties arise
constantly strive to identify the best solutions (technical, communication, management, financial, and others).

If a person becomes toxic to the team (for example, egocentric), it is better to remove them from the team, even if they are technically excellent.
Furthermore, it is considered that the more diverse the team’s composition, the better. Diversity is encouraged not only in terms of technology familiarity, but also in terms of interests, culture, languages, gender, age, and other basic characteristics.
Diversification is also important for the skills of team members - the more specialized the project’s specialists, the higher the risk of bottlenecks. In general, diversifying a team’s skills and experience is very beneficial - it allows for simpler and more effective solutions.

It’s a good idea to include team members with expertise in a variety of fields, not just technical specialists. This includes people with coaching, management, process improvement, documentation skills, and other skills.

Each project section should be assigned to a separate team. This will prevent responsibility for the performance of a specific section from being diffused across teams. Furthermore, strict segmentation and definition of areas of responsibility will improve manageability not only of specific sections but also of the project as a whole (across various areas). This principle applies not only to IT projects but also to other industries. At the same time, assigned sections should not be viewed as a burden or obligation. Teams must be instilled with a sense of ownership over the results of their work.

Types of Teams

The authors of Team Topologies divide teams into four types:

Stream-aligned - a team aligned to a single stream of work (this could be a product, a set of enhancements, or a small area of the product). The priority is to deliver the product to the customer as quickly and reliably as possible. This team is closest to the customer and is required to communicate with them regularly and in near real-time.
Enabling - a support team assisting the stream-aligned team in delivering the product (by identifying defects, suggesting best practices and technologies, resolving difficulties, etc.)
Complicated-subsystem - develops and supports highly specialized system components that are very complex to implement, but without which the product would lose its value.
Platform - mitigates the negative impact of simple and complicated cognitive domains on stream-aligned teams. This could include, for example, providing tools for automation, monitoring, and auditing.

Their interaction during product development can be depicted as follows:

Team Types

This approach to organizing the development process assumes that each team has its own product (this doesn’t mean they develop it themselves; teams can use existing products) that adds value to the primary goal of satisfying user demand. Interaction between these products is assumed to be achieved through an API with clearly defined principles and rules. A well-designed platform UX, or DevEx (defined by the ease of getting started with a third-party platform), will attract a large audience.

Stream-aligned teams

These are viewed as the primary producers of products and services, a kind of “workhorse.” The word “stream” in the team’s name is not accidental - the authors believe that product or service delivery should be streamlined and embedded in the Continuous Design and Continuous Improvement patterns. Effective stream-aligned teams are characterized by the following behavior patterns:

stable delivery of value
rapid product and service adaptation to current requirements
continuous product improvement, personal training, and adaptation to modern requirements
not shifting responsibilities or tasks to other teams
maintaining an appropriate level of product and service quality, continuously improving it
liaising with other types of teams and assisting in improving their products and services (not taking responsibility, but facilitating improvement)

Typically, the ratio of stream-aligned teams to other teams in companies ranges from 6:1 to 9:1.

Enabling teams

They are support teams that help identify problems and shortcomings in stream-aligned teams to improve their effectiveness. Enabling teams should not be permanently tied to the stream-aligned team! They should provide sufficient support for the stream-aligned team to operate autonomously in the future. After training the team to a sufficient level, the enabling team moves on to the next team and implements the improvement appropriate to its context. This is achieved through transparency and clear communication. Typically, enabling teams focus on a small number of areas of assistance, but they should not be limited in any way. The following behavior patterns are expected from enablers:

continuously seek improvements for stream-aligned teams
review and propose the implementation of new tools, approaches, and practices to speed up stream-aligned work
deliver both good news (“You can use X to speed up Y by n%”) and bad news (“Library X is no longer being developed; it’s better to abandon it”)
share experience and knowledge by serving as mentors

In short, they help deliver products and services faster and with higher quality.

Complicated-Subsystem teams

They develop complex subsystems (product or service areas) that are highly dependent on competencies. They also serve as support teams for stream-aligned teams, although they may also take on the development of the most complex areas of the entire product, without which the product would not function. For example, a complex encryption algorithm, a financial transaction guarantee system, real-time trading algorithms, and so on. Expected behavioral patterns:

close collaboration with stream-aligned teams in the early stages
deep development of the subsystem in later stages
assisting stream-aligned teams in prioritizing and implementing their needs.

Platform teams

These teams provide support services that help reduce the cognitive load on stream-aligned teams (others, to a lesser extent). Platform teams should view the results of their work as a ready-to-use product, regardless of whether it’s for internal or external customers (in other words, the product must also be competitive in the external market). Products must be proven, reliable, and efficient - platform teams prioritize quality over quantity! Some companies use an internal marketplace for services provided by various platform teams - this allows them to evaluate the quality of platform teams’ work and the value of stream-aligned products and services for the company. Expected behavior:

close collaboration with stream-aligned teams to meet their needs
fast and high-quality service delivery for stream-aligned teams - a pipeline model for delivering well-known and well-researched services
quality and stability are a priority!
adopt new technologies and practices cautiously

Community of Practice, COP

This is a group of professionals that helps increase knowledge and capabilities among teams. A COP can include representatives from different teams to share accumulated experience and knowledge on solving various problems. COPs typically do not provide basic training; they are aimed specifically at solving problems of medium and high complexity.

COP vs. Enabling Teams

These groups differ (although the same professionals may participate) – a COP focuses on solving known, common problems across a large number of teams. Enabling teams, on the other hand, work with teams individually, based on their needs.

Incident Team

It is recommended to involve specialists from various team types in resolving incidents. Moreover, it is advisable to determine the composition of the incident team in advance.

When to transform?

Transforming team structures (partial or full) can generally be very beneficial – it provides additional experience for the team. However, certain events sometimes arise that indicate the need to reform the organizational structure, in particular:

team growth (exceeding the 7-15 member limit; resource calendar overlaps; increased complaints)
delivery delays (noticeable increase in delivery times; metrics show a decrease in team efficiency; complaints from team members about the need for optimization)
increasing component dependencies (supply chains become longer due to codependencies; service reuse becomes too expensive; business logic becomes opaque)

When discussing changes, it’s important to emphasize that they shouldn’t be random. Therefore, analytical tools are essential, and metrics are one such tool. Metrics don’t necessarily have to be indicators in a task tracking system (e.g., Jira). The system being developed can also serve as a source of metrics - it can collect business metrics that can then be taken into account. The more metrics collected, the easier it will be to identify areas for improvement (both in the application and within teams).

During the Performing process, it is necessary to continue improving not only the product, but also the team management, including methods and technologies.

Communications

During the Performing process, teams must continually receive feedback from users (not necessarily key stakeholders or customers).
In addition to interacting with users, the team must also communicate with other process participants, particularly DevOps (individual engineers or the entire organization). There are a number of recommendations for communication between Dev, Ops, and DevOps, described in detail here. To accelerate communication between teams, it’s essential to take advantage of any opportunities that arise!

Some companies use Site Reliability Engineering (SRE) - an approach to software development and operation. SRE engineers use a full range of practices and tools that promote the reliability, automation, and sustainability of developed products (SRE engineers can support several projects in parallel). Their responsibilities also include providing recommendations regarding supported products to improve performance (including software product reviews), self-healing, and self-management. These specialists are highly skilled and have extensive experience in developing complex systems. Depending on the development stage, the role of SREs may vary:

initialization stage - SREs are not involved, since the product’s future is uncertain and engaging such specialists is too expensive
formation stage - SREs are involved only partially; this is sufficient to provide general recommendations based on the obtained metrics
development stage - SREs are extensively involved; the product is considered mature and has a wide audience, and the SRE’s responsibilities include maximum optimization and automation
transition stage - the development team and SRE specialists have achieved the target indicators, after which SRE specialists can hand over authority to the operations team

Communication between teams

The authors of Team Topologies distinguish three main types of interaction:

Collaboration

Close interaction between teams (effectively turning two teams into one). This is useful for short-term projects, when a technology needs to be quickly implemented, or in the early stages of product development. Long-term interaction between teams in this mode leads to decreased efficiency and blurred lines of responsibility.

+ rapid research
+ accelerated interaction
- blurred responsibility
- decreased efficiency
- increased cognitive load

X-as-a-Service

Teams view each other as product or service providers. Interaction should take place through a well-documented and structured API. Versioning with backward compatibility is also essential! If a consuming team needs a new feature, the owning team is not obliged to implement it immediately, and the consuming team cannot force it to! Since a service may have multiple consumers, the owning team must analyze and prioritize the tasks to be implemented.

+ clear areas of responsibility
+ limited cognitive load
- slow interaction
- defects in functionality or inaccuracies in the API description
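
The requirement for versioning with backward compatibility can be illustrated with a small check. The sketch below assumes the service uses MAJOR.MINOR.PATCH version numbers in the style of Semantic Versioning, with major version 1 or higher; the book does not prescribe a versioning scheme, and the function names are hypothetical.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_backward_compatible(provided, required):
    """True if a consumer built against `required` can use `provided`.

    Assumption: a new major version may break consumers, while minor and
    patch releases only add features or fix defects.
    """
    p, r = parse_version(provided), parse_version(required)
    return p[0] == r[0] and p[1:] >= r[1:]


# A consumer built against 2.1.4 checks two published versions.
print(is_backward_compatible("2.3.0", "2.1.4"))
print(is_backward_compatible("3.0.0", "2.1.4"))
```

With a rule like this, the owning team can release minor versions at its own pace without coordinating with every consumer, which is exactly what keeps this interaction mode cheap.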

Facilitating

A supportive relationship in which one team provides assistance to another unilaterally. This is the primary communication model for enabling teams. Typically, a team that communicates in this mode does so with a large number of teams at once.

+ supporting stream-aligned teams
+ identifying and communicating areas for improvement
- requires an engaged and experienced team
- may cause discomfort for one party

Communication between teams gradually evolves and flows from one mode to another. This depends on many factors, but primarily on the developmental stage of the teams’ interactions.

Communication Approaches

Command and Control. Assigning a task to a performer and monitoring its execution. Based on hierarchy, this approach is considered outdated but has a number of positive qualities.
Promise Theory Model. A task is defined for a performer, after which they agree (or not) to implement it. During the discussion, additional constraints (e.g., deadlines) are developed, and the performer provides performance guarantees. At the same time, risk response strategies are developed, which depend on previous experience working with the performer. This approach is more flexible and modern. However, it also has a number of drawbacks and is difficult to implement for complex or critical tasks.

In IT (and more specifically, coding), pair programming, mob programming, or whiteboard sketching can be useful for improving collaboration within a stream-aligned team - all of these can be combined with one of the DevEx development methods. It’s best to conduct sessions not in the abstract, but on real-world tasks.

Similar sessions can be conducted for facilitating teams, but the implementation should be specific to the tool the facilitating team is implementing or proposing to implement. Again, the session shouldn’t be abstract - it should have a real, tangible application. For example, a session on a new testing tool for Application X.

Communication between teams doesn’t emerge spontaneously; it’s shaped by people. A product or service architect can establish an approach to interteam communication. While it’s commonly believed that their responsibilities are solely to define technical solutions, architects inadvertently shape interteam communication, meaning they must evaluate both communication and the cognitive load on teams during the technical design phase.
Any interaction between teams is expensive. And the more useless it is, the more expensive it is. Close communication is sometimes necessary for short periods, but it should be kept to a minimum.
Sometimes, to improve collaboration, small teams are created on both sides for close collaboration, while remaining part of their core teams. This facilitates the transformation of a small portion of the team from one type to another (for example, from stream-aligned to enabling).

So, in addition to a team-oriented organization, a properly selected System Architecture is required for rapid software delivery. Over decades of software development, engineers and architects have already developed a number of successful practices:

loose coupling - components should not be rigidly coupled to each other; connections should be flexible
high cohesion - one component solves a single problem. This problem should be strictly defined and its boundaries cannot be violated
version compatibility - appropriate versioning
cross-team testing - stated principles of inter-team collaboration (in particular, testing and incident resolution)

Performance

Some IT project management professionals believe that high IT productivity can be achieved in several ways:

systems thinking – the entire company is optimized, not just individual parts.
feedback – information from the business is received quickly and efficiently.
continuous improvement and training in new technologies and methodologies.

Feedback (point 2) deserves special attention. The business should be the “hidden architect” of the system, shaping it through feedback to the development teams. The business must provide feedback and notify of any “bumps” it encounters. When implementing a new feature, A/B testing can be used to obtain metrics from the business for both solutions. Highly qualified specialists can also be assigned to service desk positions to identify system deficiencies (at least for a short period of time).

Monoliths

Some experts believe that the technical aspect of monolithic solutions is the most detrimental. However, this is not entirely true. The authors of Team Topologies particularly emphasize the communication difficulties of such architectures. If the architecture tightly couples components, this deprives teams of autonomy and agility, and diffuses accountability.

Types of Monolith

Application. A large application that integrates all system functions. It has a large number of dependencies.
Joined-at-the-database. A number of applications or services linked by a single database.
Builds. All system components must be built simultaneously to ensure stability. Although the components are formally independent, they still effectively form a monolith. This type of monolith is limited to a shared build.
Coupled Releases. An addition to the previous type of monolith: in addition to a shared build, it also requires mandatory installation of all (or some) components each time.
Single Model. Widespread use of models and views for all components, unifying them not by content, but by their view of the environment.
Standard. Applying a unified approach to all system components and even other systems within the company. This approach has both pros and cons. But the authors believe that any monolithization is harmful.

It’s important to avoid a distributed monolith - an architecture in which parts of the system are seemingly separate but still tightly coupled. This still makes system components dependent on each other.
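
One of these couplings, the joined-at-the-database monolith, is easy to spot if the company keeps an inventory of which service uses which database. The sketch below assumes such an inventory exists as a plain mapping; the service and database names, and the function `shared_databases`, are hypothetical.

```python
from collections import defaultdict


def shared_databases(service_dbs):
    """Map each database to the services using it, keeping only shared ones."""
    users = defaultdict(set)
    for service, dbs in service_dbs.items():
        for db in dbs:
            users[db].add(service)
    # A database with more than one user couples those services together.
    return {db: sorted(svcs) for db, svcs in users.items() if len(svcs) > 1}


# Hypothetical inventory: service name -> databases it reads or writes.
inventory = {
    "orders": ["main_db"],
    "billing": ["main_db", "ledger"],
    "search": ["index"],
}
print(shared_databases(inventory))
```

Every database reported here is a place where formally separate services, and therefore the teams that own them, still depend on each other's schema changes and release timing.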

Fragmentation

To split a monolith into several parts, it is necessary to define a Fracture Plane, which is usually obvious and stems from the specifics of the business logic or system.
It is best to follow the principle of separation along the business domain boundary. This approach is presented by Eric Evans in his book, Domain-Driven Design. Defining component boundaries in the DDD approach requires extensive knowledge of the business domain and technical expertise.

Areas for Improvement

To improve team performance, and therefore the quality and speed of delivery, the aforementioned knowledge is not enough. The entire company (or more precisely, its approach to delivering products or services) must change for the better. This can be facilitated by:

a healthy organizational culture - supporting the desires and aspirations for personal growth of teams and their members
application of best practices, both technological and managerial, such as an approach to continuous product improvement, mobbing code reviews, avoiding in-depth analysis of isolated incidents, and others
meaningful budgeting - CapEx and OpEx should be targeted and justified
a clear business vision - the clearer the business’s understanding of what a product or service requires, the more successful the project will be

First, it is necessary to understand the key business areas. Classic business market analysis is appropriate here.