Amplify (Individual) Learning

As I (finally) started reading Lean Software Development by Mary and Tom Poppendieck [1] yesterday, I found myself reflecting on the literal meaning of the second principle: Amplify Learning.

Totally unrelatedly, a discussion had started at work some days ago around making sure we keep learning, and encourage everyone in the company to do so. As knowledge workers, learning is indeed a critical part of managing our careers.

So reading "amplify learning" clicked with me, and got me thinking: in my current team, I believe we are not really good at making sure that when someone learns something while working on a Story (a useful API, a nice way of solving a recurring problem, or something new that could prove interesting to the bigger team), it is then systematically shared with the rest of the team.

For instance, during a Friday lightning talk session, or something similar.

Is Unshared Learning Waste?

swarf

The first Lean Software Development principle is Eliminate Waste. Oh wait, I thought, that’s interesting.

I started scribbling a diagram, with my 5-year-old drawing skills, to represent the indirect "result": individual learnings are lost if unshared. Basically, if we are looking for an analogy, these lost or unoptimized learnings could be seen as a by-product, or worse, as swarf. [2]

optimize learnings

Condensing boiler

I thought about the condensing boiler I have in my house. Roughly, this type of boiler recovers heat from the burnt gas by condensing it into water, and reinjects that heat into the circuit. [3]

This gives that type of boiler better efficiency. In other words, for the same amount of gas, it produces more heat.

You see where I’m going, right?

Share learnings in the Definition of Done

So I guess it is now clear: I posit that not sharing learnings with the full team in a systematic manner is a lost opportunity :-).

Obviously, the analogy could be seen as falling short, since the learnings are not actually lost for the developer who went through them.

Well, I am inclined to think this would still be very suboptimal. We have all realized, too often, how trying to explain something to someone else reveals how shallow our understanding actually is.

Hence, agreeing as a team that a lightning talk [4] will be delivered for each Story, for example, sounds like a good way to force the knowledge to be structured, understood deeply, and magnified for the benefit of the whole team. [5]

Recap

In my experience, doing this kind of sharing on a voluntary basis, during brown bag lunches for example, always stops more or less quickly. The pressure of the day-to-day life we all go through tends to come back naturally after some time, and makes the habit fade out progressively.

By making it part of the team’s Definition of Done, it sounds like we could amplify individual learnings to the full team (and more! [6]).

For instance, I would imagine something like a 30-minute session on Friday mornings, after the team standup. Each presentation would be 5 minutes maximum, given by the people who completed Stories since the previous Friday.

What do you think?

Note
If you have tried something like this, or if such a practice is even still in place in your team, I am very interested to hear your thoughts! Do you like it? What did or did not work?

1 Lean Software Development: An Agile Toolkit, by Mary and Tom Poppendieck
2 Thanks a lot Owen and James :-)
3 Note however that IANAP: I Am Not A Plumber ;-)
4 or a blog entry, or both!
5 On top of helping teammates grow their public speaking skills.
6 recording the talks could even have an impact company-wide, though it might put unnecessary pressure on people

Challenge Your Team, With Deep Care

Over the last few years, I have come to read and reflect more and more on some aspects of great teamwork. There is now quite a corpus of literature discussing what makes high-performing teams.

Feeling safe


It took me literally years to discover and understand why sane conflict in a team is not only good, but even required.

Care and Challenge.

Getting to know a bit more about what you don’t know is progress, right?

I must say I love this.

I got out-of-band feedback about a case at a well-known company, where a person in a leadership role would strongly challenge teams to make bolder moves, think more strategically, etc. People felt attacked, and even threatened.

To be clear, I am not arguing that this behavior is desirable. I am actually saying this behavior is the exact opposite of what is needed.

Challenge is needed, for sure. But it cannot come before trust is established, or worse things will happen.


One critical finding was discussed in Duhigg’s Smarter Faster Better [1]: the People team at Google was trying to find out what the criteria for high-performing teams were. After crunching the data, they found that the one criterion shared among all these teams was so-called psychological safety.

being able to show and employ one’s self without fear of negative consequences of self-image, status or career

Or, as written in Smarter Faster Better:

is a shared belief, held by members of a team, that the group is a safe place for taking risks.
[…] sense of confidence that the team will not embarrass, reject, or punish someone for speaking up

In other words, you know you can make mistakes, and that it will generally not threaten your job.

And while so many crappy companies or crappy managers still stupidly punish mistakes, we all know very well what happens: people will not stop making mistakes, they will hide them. And everybody ends up suffering: people and the business. What for then, FFS?

Safe, and sane emulation

Mistakes. Mistakes, and taking risks. Together.

radical candor

Another famous book discusses the importance of challenging your teammates: Radical Candor. Its framework is impressively simple, and powerful for understanding why caring and challenging are so critical. The book is literally stuffed with insightful, and often funny, anecdotes illustrating its points.

Kim Scott spends, in particular, a good number of pages discussing the case of ruinous empathy, deemed the most common mistake when working together. Basically, you are nice to someone, very nice. You want to be nice, and to not hurt people, so you do not tell them what is going wrong. And if or when things go bad, people get mad at each other because of unvoiced problems, or worse, people end up fired. [2]

Develop relationships

This is where the crux actually lies. To be able to challenge, there must be a strong relationship. Challenging someone is hard.

Imagine a long-time friend comes and tells you:

Hey, I think you made a big mistake there on XYZ

Likely, you’d be genuinely interested to hear what you did wrong.

Now, picture yourself in a similar situation, but with someone you’ve only been working remotely with for one month, and have never met in real life; someone you only see roughly once per working day, during standup calls and various rituals with the full team.

Which one is more likely to trigger an immediate bad reaction? Which one is at least going to shift your mindset into a defensive mode instead of a learning one?

This all comes down to trust. This all comes down to knowing that showing your weaknesses won’t be held against you. And to build trust, one has to work on relationships.

And if you want data to back this discussion, building trust inside and across teams is also recommended in Accelerate [3].

Conclusion

Care deeply about your coworkers, and challenge them.

I would even say: really caring about your teammates implies you will make the effort to challenge them. They will trust you, because they know it is not meant to hurt them, quite the contrary. People will know you expect them to do the same. [4] And overall, the team will grow together.


1 https://charlesduhigg.com/books/smarter-faster-better/ chapter 2. Thanks Tyler for this one ;-)
2 Which she finally had to do, and was deeply ashamed of. "But why didn’t you tell me?"
3 Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations, by Nicole Forsgren, Jez Humble and Gene Kim
4 and make sure to praise people who do

On Story Points

I see many people complaining about what a crappy tool story points are. I agree, the tool is very imperfect. We still consciously choose to use them in our team, though. So why?

Clarifying scope as a team

Basically, story point estimation is the least bad tool we have handy for triggering discussions as a team.

We require Acceptance Criteria, and try to refine them until our estimates during planning poker are close to each other.

If someone in the team says 2 and someone else says 8, we will never just average the values and move on. We treat this difference as a sign that we need to discuss more and refine the scope together.

Often this will mean rephrasing or adding an acceptance criterion, or even, in some cases, adding an "anti" acceptance criterion clarifying that something is NOT in the scope of a given chunk of work. (E.g. we could decide to exclude testing one specific deployment target among many as a first step, because we know it would jeopardize our chances of completing the work within the Sprint.)

Smaller chunks of work are always the target.

Are points important then?

No.

We often reaffirm this during our planning meetings: our goal is to agree on the scope of work as a team. Estimating is an imperfect way to achieve this goal, but points are not, and will never be, a goal in themselves.

If we find a better way to trigger discussions and reach an agreement together, we will throw story points away without ever looking back.

Either small, or too big

In a past team, I experimented with agreeing to have stories that were either small, or explicitly too big. Removing the blurry zone in the middle helped us avoid accepting dangerously uncertain tasks into our Sprint.

This came up again recently, with a decision to only accept 3-pointers maximum, because stories of 5 points or more on our [1] scale generally ended up being dangerous.

I think this is probably where teams should look first, if the goal is to become more predictable. In other words: either your Sprint contains only small or very small stories, or you work on splitting the bigger ones so they can enter a Sprint and start being delivered.

But I have big stories that can’t be split!

kniberg testable usable lovable

Yeah, we are all special. I believe we all have, starting with me, this tendency to think "oh well, this thing is what we need, it cannot really be split".

But I think this is generally laziness (again, myself first) about spending more time figuring out how to split work along the lines of earliest testable/usable/lovable for the customer.

So, counting points?

I think this is where the crux actually lies nowadays. We see various people complaining about Story Points being bad, but I have yet to see these same people propose what to do instead.

I like the idea from this ThoughtWorks article of using story counts for planning, while keeping estimation sessions as a vehicle for team conversation:

  1. We still maintain our estimation sessions. We highly value the team conversation catalyzed by gauging the size of the work.

  2. Leave the estimate points as a reference on the card, which could help inform prioritization. But we do not translate those numbers into scope or capability.

  3. We started using story count in our burn-up charts.

We need to dive deeper into our Sprint data, but my gut feeling about the number of stories we generally deliver tells me there is some interesting truth, or at least some insight, in this reflection.

Conclusion

As it stands now, I think the estimation sessions are still the best imperfect tool we have to refine and agree together as a team on the scope of a work item.

I am however eager to hear any counter-opinions or alternative practices to Story estimations for this purpose.

As for using these points to do the planning, I will most probably reflect more on switching to story counts. I definitely like the idea, because it would reaffirm even more that story points matter neither outside the team, nor once the planning meeting is done. [2]


1 I.e. there is no company-wide agreement on what 5 points means, and obviously there never will be: that would be an undesirable and dangerous idea.
2 Writing this article actually helped me take my thinking further. So this does confirm to me that writing is important, and that the commitment I made in my Meta entry was important to make.

Bodybuilding For A Software Developer

Around the beginning of 2017, having started working for CloudBees the previous August, I realized I hadn’t yet found a way to really get back to sport. At some point, I was neither feeling very well nor happy with how I looked, and I decided to do something about it. I think I was also starting to realize I was not that young anymore, and that nobody but me could work on this. Probably some form of mid-life crisis, I suppose. With 2 kids back then, soon to be 40 years old, and realizing that one day, yes, I am going to die…

After a few weeks of procrastination, around March 2017, coming back from a week of skiing and the associated tartiflettes and the like, I decided to start going to a gym near my coworking space.

I was working in a fully flexible company, and hence had the opportunity to go at times that are unusual for typical French companies. So I started thinking: "oh well, going to the gym after dropping off the kids and before starting my day, or in the middle of the afternoon, would be nice, wouldn’t it?"

My first year

So I started going and, in parallel, I started watching fitness YouTube channels about how to do things right (Lucas Gouiffes, Nassim Sahili, Jean of All Musculation, Enzo Foukra, Jeff’s Athlean-X, etc. [1]). I started to learn about nutrition, the right number of reps and sets depending on your goals, the so-called mind-muscle connection, full-body, half-body, split, the names and functions of various muscles, etc. (spoiler: I’m still bad at this). It was great. I was literally starting to realize what some muscles were designed for.

And because I was a beginner, liked it, and was committed to it, I got great results. I got better than my previous self in just a few months. Between mid-March 2017 and June 2017, I trained 42 times. In that same period, I lost close to 7 kg. [2] When I went to DevOps World - Jenkins World 2017 in August, I was so addicted that I trained at the hotel gym every morning. I even ended up creating a ContinuousSport Instagram account, where I didn’t post that much.

Hard stop

scapula fracture

After one year of practice, I went skiing again. After a stupidly missed turn, I got thrown into the air and landed hard on my left shoulder. I had broken my scapula, and had to stop using my left arm for a few weeks. The upside, however, was that I got an awesome 3D reconstruction of my scapula scan to brag about.

After a few weeks of no training at all, it was finally time to start my path back to normal.

Rehabilitation

I had been practicing bodybuilding, and weight lifting overall, for one year already, with a special focus on execution quality, following the advice gathered from various experts and coaches on YouTube. If you know what rehabilitation generally looks like, you know that repeating the exercises the physiotherapist gives you is critical for getting back to normal life.

Given how much I was craving to get back to training, I applied the same discipline to those exercises and regained full mobility of my shoulder within a few weeks.

Some backstory, and fast-forward to September 2019

Years before this, around 2012, during a badminton match, I hurt my back badly and got a terrible lumbago. I think it was just the consequence of progressively but drastically reducing the amount of sport I practiced between 20 and 30 years old, while doing a lot of renovation work on my home for 2+ years. So I went to a specialist, and he explained how the spine works, and specifically the vertebrae. I had almost reached the point of a herniated disc, but not quite. I spent a number of months keeping my back very straight. I ended up with a mixed result: no more pain, but a definite feeling of a weak back, and of not being able to confidently pick things up from the ground, or simply bathe my kids.

After practicing all sorts of pushes and pulls, especially deadlifts and associated movements for that area, I feel much better nowadays. I do feel strong, or at least way stronger and more balanced than I was before. I can pick up my ever-heavier kids without effort, and without feeling pain or stress about doing so. Which is awesome for day-to-day life.

Conclusion and what’s next

So here it is. I am doing bodybuilding, and I like it.

I am now convinced that this is a very nice sport for staying healthy in the long run (more so than running, for instance), especially for a knowledge worker who spends most of her or his time sitting in front of a computer.

My current plan is to keep practicing as long as I enjoy it. I am also going to add a bit more cardio training for better overall health. This is something I have not done enough in the past months, and it is also presented as very important for performing in your day job in HBR’s 10 Must Reads on Managing Yourself. That might be an interesting follow-up to this article :-).


1 Nowadays I only still follow Nassim, and watch some Athlean-X videos, though I try to stay away from YouTube
2 As you can sometimes read, losing so much weight while gaining muscle at the same time is a one-time privilege reserved exclusively for beginners. TL;DR: you normally cannot lose weight that fast if you also want to grow muscle.

Meta

I want to get back to writing. I am convinced that writing helps get things out of one’s head, and is a good exercise in itself.

The posting frequency here has dropped way too low for my taste.

With this entry, I am publicly committing to publish at least one entry per week during September 2019. [1]

Many things are ongoing on a personal and professional level that make blogging harder, but clearly that is also mostly a bad excuse. I definitely spend enough time on things that are unimportant to me (Twitter and YouTube, looking at you) that I can recycle some of this time to blog instead.

Given my career change, and the fact that I have been reading more lately, I expect this should overall provide good material for reflection, and hence for blogging.

This will somewhat by definition be less technical, but I will keep talking about my various technical findings too. For instance, after the shutdown of CrashPlan for personal use, I spent a fair amount of time a few months ago experimenting with various backup solutions. My backup now runs incrementally in the background every 15 minutes, using Restic.
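For the curious, here is a minimal sketch of what such a setup can look like; the repository path, password file and backed-up directory below are hypothetical placeholders, only the restic flags themselves are standard:

# crontab entry: incremental backup every 15 minutes
# (repository, password file and paths are illustrative examples)
*/15 * * * * restic -r /mnt/backup/restic-repo --password-file /home/me/.restic-password backup /home/me --exclude /home/me/.cache

Since restic deduplicates, each run only uploads what changed, which is what makes such a high frequency practical.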

Stay tuned and see you soon! :-)


1 Yeah, I am not implying there are people out there who actually care about my commitment; this is just a way to somehow force myself not to forget.

15000 push-ups, OKRs, and a reflection on goal setting and team dynamics

Around last September, I registered for a 15k push-ups challenge organized by Heinz Kabutz. The challenge was simple:

  • Bronze: 1000 push-ups during October 2018

  • Silver: Bronze, plus 2000 push-ups during November 2018

  • Gold: Silver, plus 3000 push-ups during December 2018

  • Platinum: Gold, plus 4000 push-ups during January 2019

  • Diamond: Platinum, plus 5000 push-ups during February 2019

As I was finishing the last push-ups of the Diamond level a few days ago, and also finishing Radical Focus [1], I started thinking about motivation.

Hard, but not impossible

For a bit more than a year now, advised in part by Tyler, I have been reading more around leadership and management.

In Smarter Faster Better [2], there’s an awesome story about a study where a group of people would be asked to run 100 meters in 10 seconds. Then, they would be asked to run 200 meters, still in 10 seconds.

Both are hard, or even impossible for mere mortals, but we all know that 200 meters in 10 seconds is simply impossible for any human, period.

Guess what they found: when trying to run 100m, people would cover significantly more distance in 10 seconds than when aiming for 200m. In other words, it demonstrated that giving people a clearly unreachable objective results in reduced performance.

Back to push-ups

When I started this challenge, I thought I would definitely do the first month (30 push-ups a day on average), and likely the second one in November. I was unsure about December and January, and quite certain I wouldn’t be able to keep going until the end in February, where it would average 179 a day!

But since I did finally make it, I think there are two learnings here.

Progressivity is key

I am pretty sure I would have failed if I had had to start directly at 5000 a month. Adding 1000 more each month trained me, both physically and mentally, for the next stage.

Linking back to my usual job, I feel this relates, for instance, to the common adage Start small. I also cannot help thinking of some variation on John Gall’s law: if you start too high, you are likely to get burned and have to start over.

Teamwork

That one might be less obvious, but it definitely played a role in my case. Knowing I was not the only one going through all this certainly made me put in more effort, especially on days when I felt less motivated. This seems crystal clear to me: had I started this challenge on my own, alone, it would have been orders of magnitude harder to stay committed to it. I see a link here with the work people usually do as a team, from which they derive more energy and fun to succeed together.

Objective and Key Results

In Radical Focus [3], Hanna, Jim and Jack discuss OKRs, and defining a quantitative Key Result for the Objective they’ve just settled on.

[…] Hanna continued, "Like reorders at 30%?"

Jim jumped in. "OKRs need to be hard goals. The kind you only have a 50/50 shot of achieving […]"

Jack jumped in. "100% reorders!"

Jim smiled. "Is that possible? It can be upsetting to set a goal the team knows they cannot achieve."

page 33
— Radical Focus

Conclusion

I am happy I finally put this out.

As is often the case in sociology, I feel, this kind of thing looks obvious after the fact, but it is not.

I am also very happy to have experienced this in real life. It’s like common mistakes in Software Development: you are taught to avoid some well-known ones, but still being able to afford making a few yourself proves so much more useful in the long term.

Define hard goals, but not too hard. Iterate and raise the bar as you go; in the long run, you will go much further and higher.


1 Thanks Isa!
2 Writing this without the book handy, so I hope I didn’t mix books. Will check when back home.
3 By the way, another great book where the first part is written like a novel. Similar to The Phoenix Project, or The Goal in this regard.

Throttle network bandwidth on Linux

Recently, as some of our Jenkins Evergreen tests related to file downloading were a bit flaky and failing more often, and hence slowing down unrelated PRs, I decided to dig in to see how, or if, I could improve the situation.

The main issue seemed to be that one test downloading a 5 MB file could sometimes time out (i.e. take more than 120 seconds). It looked like the network, even though everything runs in The Cloud™, would sometimes temporarily and randomly drop to very low download speeds.

So I decided to check how these tests would behave on my own laptop with a low-speed Internet connection. Given my own Internet access is decently good, I needed to find a way to reduce my speed and artificially simulate a worse network.

Note
For the record, I’m using Fedora 28 Linux, kernel 4.18.7-200.fc28.x86_64. Trickle is version 1.07, and tc is iproute2-ss180813.

The Tools That Work

After a bit of research, I found two candidates to achieve this: trickle and tc.

Trickle

Download metrics

We are going to use curl here, with its neat built-in measurement feature. (See this article for more context.)

export CURL_METRICS="-w '\nLookup time:\t%{time_namelookup}\n\
Connect time:\t%{time_connect}\n\
PreXfer time:\t%{time_pretransfer}\n\
StartXfer time:\t%{time_starttransfer}\n\n\
Total time:\t%{time_total}\n'"

In practice

Trickle is very nice because it is a simple and effective userspace tool.

Without speed limitation:

$ curl -sL $CURL_METRICS https://updates.jenkins-ci.org/download/plugins/artifact-manager-s3/1.1/artifact-manager-s3.hpi --output plugin.hpi
Lookup time:    5.828146
Connect time:   6.355080
PreXfer time:   6.765247
StartXfer time: 7.402426

Total time:     11.861929
$ echo "c3c3467272febe6f7c6954cc76a36f96df6c73f0aa49a4ce5e478baac1d5bf25  plugin.hpi" | sha256sum --check
plugin.hpi: OK

With speed limitation using trickle:

$ trickle -s -d 100 curl -sL $CURL_METRICS https://updates.jenkins-ci.org/download/plugins/artifact-manager-s3/1.1/artifact-manager-s3.hpi --output plugin.hpi
Lookup time:    7.536724
Connect time:   9.017958
PreXfer time:   9.428549
StartXfer time: 10.065741

Total time:     37.035747

$ echo "c3c3467272febe6f7c6954cc76a36f96df6c73f0aa49a4ce5e478baac1d5bf25  plugin.hpi" | sha256sum --check
plugin.hpi: OK

That works great. If it does not for your use case, see below :-).

Traffic Control

I’m a lazy bastard. So you might wonder: why the hell would I look for another tool if trickle does the trick (pun intended)?

Well, trickle, somewhat like nice or strace for instance, acts as a wrapper. It interposes on the process’s libc calls (through LD_PRELOAD) so that it can throttle what the process can achieve in terms of network performance.

That means that if a process does not go through those calls, it will not be throttled. This is for instance the case for processes that are statically linked, or those that fork themselves. Or ones like Docker, where the heavy network calls actually happen somewhere else (in the daemon), not in the docker CLI process. Or, more generally, any process that downloads things through some way other than the standard libc functions.
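A quick heuristic to guess whether trickle stands a chance against a given binary: check whether it is dynamically linked, since the LD_PRELOAD interposition trickle relies on only works on dynamically linked executables. For instance:

# look for "dynamically linked" in the output
file /usr/bin/curl
# or list the shared libraries; this fails for statically linked binaries
ldd /usr/bin/curl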

In that case, another solution is to set the rate limiting on the network interface itself. This is where tc comes into play.

tc is a very powerful tool. It has pluggable bandwidth-shaping strategies: you can for instance simulate a flaky network with, say, 10% packet loss, define the maximum download rate, etc. I do not claim at all to be an expert in this field, quite the contrary. The commands I’m showing below are the ones that worked for me, so use them wisely, and please do not hesitate to reach out if you have comments or improvements to propose.

To apply the constraint on a given interface, tc can be used this way [1]:

commands to run as root to limit the download rate
export IF=eth0 # the interface you want to throttle
export U32="tc filter add dev $IF protocol ip parent 1:0 prio 1 u32"
export MAX_RATE=30
tc qdisc add dev $IF root handle 1: htb default $MAX_RATE
tc class add dev $IF parent 1: classid 1:1 htb rate $MAX_RATE
tc class add dev $IF parent 1: classid 1:2 htb rate $MAX_RATE
$U32 match ip dst 0.0.0.0/0 flowid 1:1
$U32 match ip src 0.0.0.0/0 flowid 1:2

To remove the whole configuration:

tc qdisc del dev $IF root
Note
For easier usage, especially to easily run all the commands using sudo, I recommend you shove these into a dedicated shell script. See mine for example.
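To check that the shaping is actually in place, the usual tc show subcommands can be used; a quick sketch:

tc qdisc show dev $IF      # should list the htb qdisc installed above
tc -s class show dev $IF   # per-class statistics: sent bytes, drops, etc.
tc filter show dev $IF     # the two u32 filters matching all IP traffic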

Gotchas

Exact maximum with tc

For some reason, I was unable to get tc to actually stay under the limit I specified. However, I was able to drastically reduce my bandwidth, even if not very precisely, which let me test how my code would behave on a low-speed network. It was enough for my needs, so I didn’t dig further.

But if someone knows whether defining a strict maximum is possible, and how, I’m all ears :-).

Quick note about Wondershaper

Wondershaper is a tool name you might stumble upon while crawling the Internet. I did. I tried it before writing this article. I even started writing the article about wondershaper instead of tc, before ditching it all.

TL;DR: do not install or try to use it.

Wondershaper is actually an outdated 169-line shell script, somewhat like the commands we wrote above. In my tests, Wondershaper’s rate limiting always resulted in a much lower actual maximum than the one requested. Hence, if you need a crappy script, you’d rather write the few associated lines yourself, so you can sort them out if need be, than debug a longer one from someone else (who left the boat long ago :-)).

Do not just believe me, go read this detailed article about it: Wondershaper Must Die.


1 The example is using the htb discipline.

How to run and upgrade Jenkins using the official Docker image

For some time now, I’ve been trying to follow and answer questions arising in the official Git repository for the Docker image of Jenkins.

I have especially been trying to encourage people to move away from using bind mounts [1] and to prefer volumes instead.

Running Docker Containers

Ideally, you never restart a container. You just start a new one from the same (or another) image.

Anything you want to keep has to be in the declared volume(s); that is all you need.
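If you want to convince yourself of that principle, here is a minimal demonstration, using a throwaway alpine container and an example volume name:

# data written into a volume survives the container that wrote it
docker volume create demo-data
docker run --rm -v demo-data:/data alpine sh -c 'echo hello > /data/file'
docker run --rm -v demo-data:/data alpine cat /data/file   # prints "hello"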

How to run Jenkins official Docker image and keep data

Tip
jenkins/jenkins is the official repository on Docker Hub for the Jenkins Project. The jenkins and jenkinsci/jenkins images are deprecated.

I suspect you’ve come here just to copy and paste commands and move on. We all do :).

So, here you are. Let’s imagine I want to run Jenkins 2.107.3; here is how you would do it for simple production usage.

docker volume create jenkins-data
docker run --name jenkins-production \
           --detach \
           -p 50000:50000 \
           -p 8080:8080 \
           -v jenkins-data:/var/jenkins_home \
           jenkins/jenkins:2.107.3
# If run for the first time, just run the following to get the admin
# password once it has finished starting
docker exec jenkins-production bash -c 'cat $JENKINS_HOME/secrets/initialAdminPassword'
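Alternatively, you can follow the container logs while Jenkins starts; the same initial admin password is printed there on first startup:

docker logs -f jenkins-production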

How to upgrade your instance to a more recent version

Using Docker, upgrading should always just be a matter of using a more recent version of the image.

Jenkins follows this pattern, so if I want to upgrade to the latest Jenkins LTS to date, 2.121.3, all I have to do is the following. You will notice that we use the exact same command as above; we have just updated the version to the one we want to upgrade to:

Upgrading to latest Jenkins LTS
docker stop jenkins-production
docker rm jenkins-production # or just docker rename it temporarily instead, if that makes you worried
docker run --name jenkins-production \
           --detach \
           -p 50000:50000 \
           -p 8080:8080 \
           -v jenkins-data:/var/jenkins_home \
           jenkins/jenkins:2.121.3

Done.
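Since everything lives in the jenkins-data volume, you can also take a cheap safety snapshot of it before upgrading. A minimal sketch using a throwaway container (the archive name is just an example):

docker run --rm \
           -v jenkins-data:/data:ro \
           -v "$(pwd)":/backup \
           alpine tar czf /backup/jenkins-data-backup.tgz -C /data .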


1 a bind mount is the term used when one mounts an existing host directory inside a Docker container. A simple way to know if you are using one: you are using a bind mount if the first parameter after -v starts with a /. Though bind mounts look simple, and hence appealing, at first sight, they are not simple at all. You want to stay away from them if you wish to avoid all sorts of annoying permission issues just when you were thinking everything was going fine

Do not run your tests in Continuous Integration with the root user

I was recently diagnosing unexpected test failures that were only happening on our brand new Jenkins environment, and not on the previous one. To provide some context, we now run the majority of things in one-shot Docker containers, and that helped reveal an interesting issue.

The offending code

We have a test that checks our code’s behaviour when a file we need to back up is not readable. It was roughly like the following:

// given
final File demo = File.createTempFile("demo", "");
FileOutputStream fos = new FileOutputStream(demo);
fos.write("saaluuuut nounou".getBytes());
fos.close();

// when
demo.setReadable(false);

// then (try to back it up, it should fail)
byte[] read = new byte[10];

new FileInputStream(demo).read(read);
System.out.println("Can I happily read that file? " + new String(read));

And weirdly enough, this test was failing. By that I mean: the code above produced no failure at all… [1]

The reason

We were running those tests on shiny new infrastructure, using (wrongly) Docker images that run as the root user by default. For instance, if you use a base image like openjdk, or basically almost any base image, you will hit this issue.

The thing is, when you are root, a bunch of things are not true anymore… Permissions, for instance…

If you don’t read Java, here’s a shell port of the Java code above:

$ echo hello > demo
$ chmod a-r demo
$ cat demo
cat: demo: Permission denied

But then replace the cat above by sudo cat:

$ sudo cat demo
hello

I, for one, was slightly surprised that root does not honor permissions at all. Had I been quizzed about this, I would probably have thought that the missing read bit would still prevent even root from reading the file (root simply being allowed to chmod it again at will to set what it needs), but that’s how it is.

Note
Most Docker base images run as the root user by default. This is often for a good reason: you are likely to use openjdk:8, for instance, and need to install additional things. But you must go the extra mile and switch to a normal user, using the USER instruction (either after creating one, or using an existing one like nobody, or whatever suits your needs).
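As an illustration, here is a minimal Dockerfile sketch doing just that; the user name builder is an arbitrary example:

FROM openjdk:8
# create an unprivileged user and switch to it, so that whatever runs
# in this image does not run as root
RUN useradd --create-home --shell /bin/bash builder
USER builder
WORKDIR /home/builder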

But running as root in a Docker container is OK, right?

There have been articles out there explaining better than I can why it’s not: reducing the attack surface, etc.

I hope this article shows that it is clearly not OK, even for things like Continuous Integration and testing, where one may think it is a special situation, and hence an acceptable exception.

Some people might argue that the situation has changed with the advent of user namespaces. I will answer that, though this is definitely a huge improvement, it does not change anything about the statement above.

Indeed, you will still be root inside the container, and your code will NOT fail as it should for that kind of case (another example, if need be: you would be allowed to use ports below 1024, when you should not). In the case of CI, you risk missing corner cases, because your CI environment will not be as close as possible to the production one. And for pretty obvious reasons, you want your tests to run in something close to production…

Conclusion

I think we can say it is commonly accepted that running a server as the root user is a bad idea. It is the same in Docker, for many reasons, and hopefully the examples above confirm it. At least it was a lesson for me, and I’ll be very cautious about it from now on.

So, if you care about your tests, and their ability to monitor and reveal issues and regressions, do NOT run your CI with the root user.


1 If you don’t read Java, in that code sample I put some text in a file, remove the read permission on it, then try to read it again. The expected behaviour is that it should fail (Permission Denied). In real life, we have that test to assert our error message is understandable by humans in that situation :-).

How to connect to a Windows AWS instance from Linux

While working on validating the new Java 8 baseline of Jenkins, I needed a Windows environment to analyze some test failures that were only happening there. So I went ahead and created an instance in Amazon Web Services, to connect it to a Jenkins test instance.

It is actually pretty simple, but I thought I would explain it quickly here, because it might save some people (like me, next time) a few minutes.

Launching your instance

This is out of scope here, as nothing about it is specific to our use case: just create an instance. I am using Microsoft Windows Server 2016 Base (ami-45e3ec52) for this article.

Caution
The only important thing is to keep the selected .pem file handy. It will be necessary to retrieve the Administrator password once the instance is created.

Your instance is now running.

Connect to it

  • In the UI, right-click on the instance, then click Connect.

connect to instance
  • Copy the Public DNS field value

  • Open a command-line and connect using the rdesktop command:

$ rdesktop ec2-54-152-45-128.compute-1.amazonaws.com
Autoselected keyboard map fr
Connection established using SSL.
WARNING: Remote desktop does not support colour depth 24; falling back to 16

This should, quite slowly, finally show the following screen:

windows login
  • On the AWS Connect To Your Instance dialog open previously, click on the Get Password button.

  • Select the .pem file discussed above. pem-selected

  • and click Decrypt Password. You should now see the plain-text value of the password to connect to the instance. decrypted

Note
I don’t know if this is specific to my environment but, if you’re as lucky as me, copy-paste will not work between your host and the remote session. So I had to retype the value manually… Cumbersome, given the password length.
  • Type Administrator and the password above on the logon screen. connected
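By the way, if I am not mistaken, rdesktop can also take the username directly on the command line, which saves a bit of typing on the logon screen:

rdesktop -u Administrator ec2-54-152-45-128.compute-1.amazonaws.com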

Hope this helps!
