Thinking about a more efficient keyboard layout other than qwerty

You are likely using a keyboard with the letters ‘q’, ‘w’, ‘e’, ‘r’ on the top left, commonly referred to as the qwerty layout. This layout was created in the 1870s. When you first learned to type, you may have asked the same question as everyone else: “why is A not next to B, why does the layout look totally random?” The usual answer is that the inventor claimed this randomness actually helps typing speed by distributing the work evenly across different fingers, and that the mechanical design of the time made it a bad idea to place too many frequently typed keys next to each other, because they would jam.

Nowadays, many of those premises no longer hold true and the constraints have been removed. You can make the keys as small as you want, or place certain keys as close together as you want, without worrying about physical limitations. At the same time, what end users type has shifted so much that certain keys are pressed far more often than 150 years ago. For example, in the modern digital world, symbols are typed much more often and sometimes even exceed certain letters in frequency. Programmers use symbols like the semicolon as frequently as writers use the period, because a semicolon marks the end of a statement in several programming languages. In a typical tweet, people use characters like “@” and “#”, sometimes without any traditional punctuation at all.

There have been some efforts to adjust keyboard layouts by adding keys such as multimedia keys, but qwerty has never been challenged to a degree that society cares enough to make a change. Maybe that is because it is not a problem at all. I agree that those who work extra hours certainly do not think it is because they cannot type fast enough. And maybe the layout is simply so prevalent that the cost and inconvenience of changing outweigh the benefit (a lot of software, like video games and editors, chose certain shortcuts because they were convenient under qwerty).

The question now becomes: can you quantify the efficiency of a layout, or even prove that a different layout brings a significant efficiency gain? Taking it one step further, out of all the possible layouts, is there a way to find an optimal one that is worth the effort of ditching what everyone has been using for the past 150 years?

Key Performance Indicator

There is a saying, “if you cannot measure it, you cannot manage it”, so let’s start by coming up with a way of measuring efficiency. There are common metrics like words per minute (WPM) that people use to brag to their friends about how fast they type. Introducing a different layout to two groups of unbiased people (maybe students or kids who are just learning to type) would be a great experiment. However, just like a covid vaccine trial, being scientific means running real experiments, and the cost of that is too prohibitive at the early selection stage. Sometime down the road, though, it will certainly be required.

Now let’s think about the typing process. It boils down to a sequence of actions: moving fingers quickly and accurately to press certain buttons. There has already been a tremendous amount of conversation about what kind of keys are easiest to press, which brought along inventions like the mechanical keyboard and debates about which Cherry switch is the most suitable, clicky versus linear, etc. However, it is quite obvious that time is spent not only pressing the button but also moving the fingers to the right position. Think about how easy it is to type “asdf” because the keys are literally next to each other. Now try to type “?\=`”: you will need to stretch your pinkie finger to the corner of the keyboard and move it around, probably with assistance from your eyes too.

So maybe one angle when looking for the best key layout is to seek one that requires the user to move their fingers as little as possible.

For those of you who have an engineering background, this probably already sounds like an optimization problem, and some of you might already be thinking about “shortest distance” or “traveling salesman”. Yes, there are many mature algorithms designed specifically to solve similar problems and find the shortest distance.

The question is now: “can we find a keyboard layout that yields a shorter traveling distance than ‘qwerty’ or, even better, can we find the best layout, one that guarantees the shortest traveling distance for a modern typing task?”

We are going to make many assumptions in this exploration. Those assumptions might be too naive and lead us to flawed conclusions, but they dramatically simplify the problem at the beginning, and we can challenge them later and add complexity as we go.

Key Positions and Size

On a typical keyboard, most keys have the same size (square or round keycaps) and are aligned horizontally and tiled vertically. Certain keys come in a different size, such as the space bar, which is the biggest, left and right shift, enter, etc. For now, let’s assume that all keys have the same size and are positioned perfectly on a meshgrid, like a chess board. That way, it is much easier to calculate the distance between any two keys without worrying about the specifics. Another benefit is that the keys can be stored in a data structure such as an ndarray, which makes the computation efficient.


Each key on your keyboard usually carries two meanings: holding the shift key changes the case of a letter or enters a different symbol. This is an important topic, because it is exactly those symbols, previously assigned to the corners or sharing a physical switch with another character, whose usage has grown lately. Separating out certain characters, or even promoting certain characters to prominent positions, are changes worth exploring. For that reason, we will treat every printable character on the keyboard as its own entity, with several exceptions.

For numeric characters, it may be worth excluding them for now: regardless of how frequently each one is used, they benefit from staying in numerical order. Likewise, the two cases of a letter should keep sharing the same key, as I can totally imagine the criticism we would receive if lowercase a shared a key with uppercase Z. What the hell? There are certainly also relationships between symbols, like pairs of brackets, or less than and greater than. In practice, autocomplete features mean that few developers have to type the closing parenthesis or bracket, because the IDE fills it in. In that case, temporarily removing those closing symbols might also make sense.

So this is what we end up with, 58 characters:

26 letters
32 symbols: ~!@#$%^&*()_+`-=[]\;',./{}|:"<>?

Distance Definition

This is probably the most important part of this analysis as it directly determines how our final score will get calculated.

Simple Travel Distance

A naive approach is to treat the keys as physical locations. Calculating the total traveling distance for typing a sentence is then equivalent to calculating the total distance of a taxi trip: we just sum up the distances between consecutive characters in this physical space.

Typing the word “hello” takes 5 keystrokes and 4 movements between keys. The total traveling distance is the sum of D(h, e), D(e, l), D(l, l) and D(l, o). Note that D(l, l) requires no travel at all, as it is just pressing the same key twice.

In our case, we can easily calculate the Euclidean distance between e and h, as their horizontal difference is 3 and their vertical difference is 1. By the Pythagorean theorem, the direct distance is D(e,h) = sqrt(3^2 + 1^2) = 3.16. In the same way, we can calculate the other distances:

D(h,e) = sqrt(3^2 + 1^2) = 3.16
D(e,l) = sqrt(6^2 + 1^2) = 6.08
D(l,l) = 0
D(l,o) = sqrt(0^2 + 1^2) = 1
=> D(hello) = 3.16 + 6.08 + 0 + 1 = 10.24

Pretty straightforward, right? So let’s see if we can write a small Python script to calculate it easily.

After populating the qwerty keyboard layout and extracting the coordinates, we can write a function that takes any given layout and any given text and calculates the total traveling distance. (Note: np.linalg.norm is a handy, high-performing function for calculating the Euclidean distance between coordinates.)
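The original script is not reproduced here, so the following is only a minimal sketch of the idea. It assumes a letters-only qwerty grid with no row stagger, and it uses math.dist from the standard library in place of np.linalg.norm to stay dependency free. The names QWERTY_ROWS, build_coords and travel_distance are made up for illustration.

```python
import math

# Assumption: letters only, unstaggered grid; x is the column, y is the row.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_coords(rows):
    """Map every character to its (column, row) position on the grid."""
    return {ch: (col, row)
            for row, line in enumerate(rows)
            for col, ch in enumerate(line)}


def travel_distance(text, coords):
    """Sum the Euclidean distance between each pair of consecutive characters."""
    points = [coords[ch] for ch in text]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


QWERTY = build_coords(QWERTY_ROWS)
print(travel_distance("hello", QWERTY))
```

For “hello” this gives roughly 10.245, in line with the hand calculation above once the individual distances are rounded to two decimals.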

It matches our calculation!

Different Layouts in Action

Now we have some basic functions to help us evaluate performance. Why don’t we feed them some real-life layouts and content and see how they work in action?

Other than qwerty, there are indeed some alternatives, like Colemak and Dvorak. Let’s create those two layouts and compare them with qwerty.

To work with real-life content, I made some small adjustments to the code: it now skips travels that involve characters we have not placed on the layout yet. I also added a counter so that the returned travel distance is normalized, which lets us compare across different content or text lengths.
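The two adjustments could look roughly like the sketch below. It is an assumption about the shape of the code, not the original: only the three letter rows of each layout are modeled on an unstaggered grid, and average_travel is a hypothetical name. The sample line is the opening of Sonnet 18, used here just as a small input.

```python
import math

# Assumption: only the three main rows of each layout, on an unstaggered grid.
LAYOUT_ROWS = {
    "qwerty": ["qwertyuiop", "asdfghjkl", "zxcvbnm"],
    "dvorak": ["',.pyfgcrl", "aoeuidhtns", ";qjkxbmwvz"],
    "colemak": ["qwfpgjluy;", "arstdhneio", "zxcvbkm"],
}


def build_coords(rows):
    """Map every character to its (column, row) position on the grid."""
    return {ch: (col, row)
            for row, line in enumerate(rows)
            for col, ch in enumerate(line)}


def average_travel(text, coords):
    """Average distance per counted move; moves touching unknown characters are skipped."""
    total, moves = 0.0, 0
    for a, b in zip(text, text[1:]):
        if a in coords and b in coords:
            total += math.dist(coords[a], coords[b])
            moves += 1
    return total / moves if moves else 0.0


sample = "shall i compare thee to a summer's day"
for name, rows in LAYOUT_ROWS.items():
    print(name, round(average_travel(sample, build_coords(rows)), 3))
```

Dividing by the number of counted moves, rather than by the text length, keeps the score comparable even when a text contains many characters that the layout does not cover.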

Here we grabbed Shakespeare’s sonnets from Project Gutenberg, and it is interesting to see that our popular qwerty outperforms the other mainstream layouts by quite a bit.

However, this prompts a few other questions:

  1. Is there any other layout we can generate that outperforms qwerty? In theory, we can generate every possible permutation of the layout, calculate the travel distance for each, and pick the best one. Is that computationally feasible? Is there a good algorithm we can use? If not, is there an approximation technique we can deploy?
  2. Here we are assuming that we have only 1 finger. In reality, each finger covers a region, and the distance should be calculated differently: the letters “f” and “j” take almost no travel, as both of your index fingers are resting there.
  3. We should also try some different content, as the results might shift depending on your profession and certain keys might need to be adjusted.

Hopefully this helps you see that the keyboard is actually a fascinating topic. I will try to cover some of the pending questions in more depth in future writings.

An algorithm question – shortest travelling distance

I have an interesting question related to calculating the shortest distance. However, rather than choosing a path through a given graph (vertices and edges), in this problem the path is given and the task is to generate the graph under certain constraints.

So let’s go through this problem in a bit more detail:


There is a fixed number of positions (N) into which we need to place N different objects. The traveling distances between all positions are predetermined and fixed. The task is a sequence of objects to pick, in an order that cannot be changed, and the same object can be picked multiple times. The question is how to place the objects so that the total traveling distance is minimized.


Now, let’s give an example.


Assume that we have 4 positions in one dimension (a list). We can easily calculate the pairwise distances between those positions.

     P0  P1  P2  P3
P0    0   1   2   3
P1        0   1   2
P2            0   1
P3                0


Now let’s go through our task. The task can be any sequence, of any length, made up of the 4 objects. For example, it could be 1212103, which means first pick up object1, then object2, then object1 again, and so on. To calculate the total distance, we simply add up the distances between consecutive objects. (For the first and last elements, we can assume that all picks originate from a starting point and finish at an ending point that do not belong to any of these positions but take an equal amount of travel to reach from every position.) In that case, our total distance will be:

Total Distance (1212103) = Const + D(1,2) + D(2,1) + D(1,2) + D(2,1) + D(1,0) + D(0,3) + Const

Distance is a function of positions, so to pick objects, each object is mapped to a position.

Distance between two objects x, y = D(p(x), p(y)) where p is one mapping


Keep in mind that the numbers above are not indices of positions; they are objects. But if we put the objects in the same order as the positions, the calculation is as straightforward as:

Total Distance (1212103) = Const + D(1,2) + D(2,1) + D(1,2) + D(2,1) + D(1,0) + D(0,3) + Const
= Const + 1 + 1 + 1 + 1 + 1 + 3 + Const, which is 8 once the two constants are left out

Simple, right? For this positioning, we achieve a total cost of 8.


Now let’s try a different positioning: let’s swap the positions of object0 and object1.

          P0 (O1)  P1 (O0)  P2 (O2)  P3 (O3)
P0 (O1)      0        1        2        3
P1 (O0)               0        1        2
P2 (O2)                        0        1
P3 (O3)                                 0

Total Distance (1212103) = Const + D(1,2) + D(2,1) + D(1,2) + D(2,1) + D(1,0) + D(0,3) + Const
= D(P0,P2) + D(P2,P0) + D(P0,P2) + D(P2,P0) + D(P0,P1) + D(P1,P3), leaving out the two constants
= 2 + 2 + 2 + 2 + 1 + 2 = 11

So far, we have two different positionings, each yielding a different total traveling distance, and the default one is better than the second one.

The goal is to find the optimal positioning, the one that yields the shortest total distance.
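The two positionings above can be checked with a few lines of Python. This is a minimal sketch that assumes positions sit one unit apart on a line and drops the two constants, since they are the same for every placement. The name total_distance and the dictionaries are made up for illustration.

```python
def total_distance(task, position_of):
    """Sum |p(x) - p(y)| over consecutive picks; positions sit on a line, one unit apart."""
    return sum(abs(position_of[a] - position_of[b])
               for a, b in zip(task, task[1:]))


task = [1, 2, 1, 2, 1, 0, 3]            # the pick sequence 1212103

default = {0: 0, 1: 1, 2: 2, 3: 3}      # object i sits at position i
swapped = {0: 1, 1: 0, 2: 2, 3: 3}      # object0 and object1 trade places

print(total_distance(task, default))    # 8
print(total_distance(task, swapped))    # 11
```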

Brute Force

Here is a small demo of the brute force approach given 4 objects at 4 positions. There are 4! = 24 different permutations in total, and two positionings yield the shortest distance. For the task 1212103, the shortest distance is 6, reached by ordering the objects 3, 0, 1, 2 across P0 to P3 or by the mirror image 2, 1, 0, 3, so neither of the two examples covered above is optimal.
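The demo itself is not reproduced here, so the following is a sketch of what a brute force search could look like, assuming the same one-dimensional positions as in the example. The function names are hypothetical; itertools.permutations does the enumeration.

```python
from itertools import permutations


def total_distance(task, position_of):
    """Sum |p(x) - p(y)| over consecutive picks; positions sit on a line, one unit apart."""
    return sum(abs(position_of[a] - position_of[b])
               for a, b in zip(task, task[1:]))


def brute_force(task, n_objects):
    """Try every placement; return the minimum cost and all placements that reach it."""
    best_cost, best = None, []
    for perm in permutations(range(n_objects)):
        # perm[obj] is the position assigned to object obj
        cost = total_distance(task, perm)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [perm]
        elif cost == best_cost:
            best.append(perm)
    return best_cost, best


cost, placements = brute_force([1, 2, 1, 2, 1, 0, 3], 4)
print(cost, placements)
```

The search visits all N! placements, which is fine for 4 objects but hopeless for the 58 characters of the keyboard problem, where 58! is on the order of 10^78.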

I have thought about some of the models I have worked with before, but none seems to be a good match right off the bat. For example, Dijkstra’s algorithm finds the shortest path through a given graph, and the knapsack problem finds the subset of goods that maximizes total value within a weight limit.

Now the question is: can we find an existing algorithm that solves this problem efficiently and, if so, how efficiently?

A reflection of my “work from home” days in 2020 – Part I

Honestly, I have never been a work-from-home person, or at least I was never used to it. When I was in school, I was the kind of student who only studied in the library or the classroom, very rarely in the dorm. After I started working, and even after becoming a homeowner, I was still the kind of person who came by the office during the weekend just because I felt more “productive”. Everyone faces different challenges and lives differently because of the pandemic, and the biggest change for me is working from home.

As the pandemic got worse, I realized that we might be in this for the long run or, frankly speaking, that this might be a permanent change that industries need to switch to. I started to accept it, to reflect, and to consider options for improving my work productivity, personal comfort and health.

In this post, I list some of the changes I observed and adopted that I want to share with my readers.


Probably the most important part of working from home, to me, is to establish a routine: one that suits you, your family or roommate, and your team best.

For me, I started by applying the same routine as if I was going to the office and then slowly came to the following schedule after trial and error:

  • Two meals a day, at 8:30AM and 6:00PM: this is probably the biggest change for me, as I used to have 3 regular meals a day and dedicated workout hours almost daily. During the pandemic, workout time became a luxury, frankly limited to light activities like jogging, walking and some basic stretching. Having 3 meals turned out to be time consuming, mentally draining and inconvenient. I don’t feel the craving or the happiness that I used to. So I cut out lunch and replaced it with snacks. To everyone’s surprise, it turned out to work great for me. Over almost a year, my body weight increased from 75kg to 80kg, but most of the gain came in the first month or two, and it has been pretty stable since then.
  • Healthy food: “marry well”. I cannot overemphasize the importance of food quality. Luckily, my lovely wife has been preparing the meals, and she is so well organized that she usually has a full week’s plan of what to eat, with a great variety of different kinds of food. All the raw ingredients are ordered from Amazon Fresh. Our diet is mainly high-protein food like chicken, salmon and eggs, with very little red meat, heavy in fruit and fiber and low in fat.
  • Unhealthy food: it might just be me, but sometimes I crave the “wrong” type of food so badly, and my wife tells me it is so visible that it looks like I almost have a “menstrual cycle” for certain foods. Later on, we agreed on a maximum of two restaurant takeout meals a week (pork belly from the Chinese place, a classic medium pizza from Domino’s, a hot dog or whatever the hell I want).
  • 4000 steps a day: “long sitting is dangerous”. Regardless of how comfortable your workstation is, getting out and having some outdoor time is essential to me. I tend to get lower back pain and a higher level of tension and anxiety if I sit too long, as in hours-long meetings, coding or gaming. Avoiding big blocks of screen time, even by taking a 10-minute break, makes a huge difference.
  • Sleeping: I have to acknowledge that I have one of the most comfortable beds and mattresses, and I am proud of that decision from when we first got our place. I tend to stay up late, thinking that I am “taking advantage” by staying up an extra 30 minutes watching YouTube videos, but we all know those hours are just biological debt that you have to pay back one way or another. I try to guarantee at least 7~8 hours of quality sleep. And sometimes a 30-minute to 1-hour nap really makes a difference when you don’t know what is wrong but sure as hell something is.
  • Dog: these days are the best time to get a pet if you don’t already have one; if you do, maybe get two :). My wife and I dog-sat our neighbour’s corgi for a long weekend, and it turned out we had so much fun. Afterwards, we both agreed that having a dog is a responsibility we would be able to fulfill and an experience we both wanted to have. Having been a dog owner for 3 months, I don’t regret the decision one bit, and neither does my wife. An interesting observation is that there is now someone else in the room to blame when your partner starts being unreasonable. You can make up excuses like “puppy doesn’t like what you just said”. Having a common goal has also made my wife and me bond better and act as a responsible couple. There is so much to learn as a first-time dog owner. However, keep in mind, dogs are the best friends.

Home Office Setup

Now, let’s move on to the physical setup. There are so many things that a real corporate office offers you without you even noticing: the face time with your team members, the great chitchat with the funny guy in your group, or even how good the sodas in the office were.

Here I will try to list several things that I purchased which greatly improved my home office experience.

Standing Desk: There are times when you just cannot be away from your keyboard, like back-to-back meetings that require your full attention. Remember the saying that sitting for a long time is as harmful as smoking. I know there is no medical evidence proving that a standing desk has a positive effect on your health, but it makes a difference to me. By sometimes raising the desk and shifting my workflow for an hour, I relax my hips, shoulders and neck so much by unconsciously stretching, changing my pose, relaxing my wrists, or just having more freedom than a chair allows. It is also a great opportunity to let your butt breathe a bit; there are many medical reasons for that, my friend. Previously, I had a cheap standing desk, more like a laptop stand. The challenge was that it was very unstable and felt like a bobblehead: you had to type carefully just so the desk would not shake. Even the expensive standing desk that I have now is still not as solid as a real desk, but for light typing and conference calls, it works great. If you are doing intensive typing like documenting or blogging, a standing desk might not be the greatest experience overall, or you may need to pay a higher price for one that is.

Monitor: I used to have a decent Samsung 27 inch as the extended monitor for my laptop. It worked great, or at least I thought so, until I switched later this year. During the holiday season, I just couldn’t resist the temptation to buy some devices, and after some research, I made up my mind to purchase a fancy 34 inch ultrawide gaming monitor. Honestly, it is probably the best purchase decision I made this year. It is different, very different, and that difference can feel uncomfortable at the beginning, but once you get used to it, it is just so much better. You not only get more screen real estate, but having a single pane of glass instead of two monitors is an experience you have to try before you judge. I also switched from Mac to Windows this year. Windows has some basic features that let you snap windows, use multiple virtual desktops, adjust the split between windows, etc. I personally feel this has been my biggest boost in both productivity and comfort. The video game I happen to play, DOTA2, also supports ultrawide screens, and many applications and YouTube videos are now a net new experience that a regular 27 inch can never offer.

Headset: I bought myself a Bose QuietComfort 35 thanks to the big annual discount on Amazon. It might have saved only one cable on my desk, but it improved the cleanliness a lot. Because it is wireless, you can now comfortably swing by the kitchen or even the bathroom without losing the context or flow of your meeting. I have a balcony, and sometimes I even put on my headset and walk outdoors to let my dog relieve itself while still being able to work effectively. Also, as a resident of Colorado, one benefit of a headset compared with EarPods is that your ears will thank you for the warm cover.

Keyboard: Full confession: I bought several keyboards this year, maybe too many. I got myself a Logitech MX Keys, a KeychronV2 red and a KeychronV1 Low Profile brown. Bottom line, all of them are very different, and you start to realize that certain keyboards work best in certain scenarios. In the end, I nowadays use the MX Keys most for day-to-day typing like work and coding, and my cheap REDDRAGON with blue switches for gaming. Every now and then, I switch between all these keyboards just to get a fresh feeling.

Mouse: I replaced my wired Razer Deathadder Elite with a wireless Razer Mamba. Don’t get me wrong, the previous mouse worked great for several years; the only reason I made the switch was the cleanliness that comes with wireless. The mouse feels virtually the same to me, but the wireless experience is just great. The standing desk that I have

Docking Hub: I brought home my DELL hub from work, which works really well with my Windows laptop. I have two PC laptops, and hooking up all the peripherals is the most annoying experience if you do it daily, or multiple times a day. This hub is pretty solid: it offers sufficient power to run all your devices and the laptop at the same time, it makes switching a one-cable affair, and all the peripherals are connected directly to the hub itself. It also keeps most of the connections a bit farther away from you, so you don’t have all the wires and devices tangled together right in front of you.

Writing Board: A very important experience missing in the WFH days is the “white board experience”: you can no longer doodle with your colleagues and brainstorm solutions and technical ideas. Yes, you can share your screen and all kinds of stuff, but have you tried to write anything with your mouse? It is very hard. I bought myself a basic writing board, the XP-pen start G6740. I can easily plug it in and start whiteboarding using either the built-in Windows app or the draw feature in PowerPoint, for a productive, quality writing experience that others don’t have. The only thing I might regret slightly is that it is wired and needs to be plugged in, but given the infrequent usage and the cheap price, I can live with it.

Cable Organizer: Packaging and modularizing are very important concepts in software development: when things become very complex, you gain simplicity by abstracting one level up. As I accumulated more cables and charging stations, I bought myself a cable organizer just to bury the cables. It certainly looks cleaner, and that is all I need.

Here is a list of the devices that I have mentioned in my home setup, in case you are curious.

The pandemic is not easy, but as long as we keep faith and an open mind for change, and adjust and adapt, we will finally come out the other side stronger and better. Hopefully this blog post is helpful to you. If so, leave a comment about what else you want to learn about my workflows. All the links at the end are Amazon affiliate links; purchasing through them might help me explore more options that I can share later. Thanks for reading and happy working from home.

Python concurrent.futures – The Basics

concurrent.futures is a module that has been included in Python since version 3.2. In other words, in Python 3.5, 3.6 or newer, it is part of the standard library, just like os and sys, and is readily available.

It has the benefit of “democratizing” parallel processing: developers no longer have to understand, or write, the tens of lines of setup code that usually come with multithreading or multiprocessing.

Without further ado, let’s start with an example by simulating how much time could be saved.


Here we have a very basic function: it takes some input, does something and returns a result.

In this case, to simulate the time saving, we intentionally let the function sleep for i seconds to mimic delays due to computing, network, I/O, etc. For analysis purposes, we also capture the thread_id, the time the function started and ended, and the input and output (a simple square).
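A function along those lines might look like the sketch below. The name task and the unit parameter are assumptions added for illustration; unit scales the sleep so that the example can run quickly, and unit=1.0 reproduces the “sleep for i seconds” behavior described above.

```python
import threading
import time


def task(i, unit=1.0):
    """Sleep for i * unit seconds to mimic slow work, then return a record of the run."""
    start = time.time()
    time.sleep(i * unit)
    return {
        "thread_id": threading.get_ident(),  # which thread ran this call
        "start": start,
        "end": time.time(),
        "input": i,
        "output": i ** 2,                    # the "real" work: a simple square
    }


print(task(1, unit=0.01))
```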

Baseline – Synchronous

Let’s start with a baseline where we randomly generate a list of tasks and loop through them one by one.

In this test, we generated 5 tasks with different delays, which should take 9 seconds in total to complete. We executed them sequentially, captured the output and displayed it as a pandas dataframe. As expected, we have 5 rows that correspond to the 5 tasks. All 5 tasks were executed by the same thread (23440), and the total time, the delta between when the first task started and when the last one finished, is 9 seconds.

This is very typical for most processing jobs: many tasks that are similar but fairly independent. Maybe each task reads in some files and processes them, or takes a URL, collects the HTML and transforms it.

In this visualization, we can see that all tasks are handled by one worker and the work is done sequentially, or synchronously.
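As a rough sketch of such a baseline: the delay list below is hypothetical (chosen only because it sums to 9 units, like the run described above), each unit is shortened to 0.01 seconds so the example finishes quickly, and the rows are printed directly instead of going through pandas.

```python
import threading
import time


def task(i, unit=0.01):
    """Sleep for i * unit seconds, then report which thread did the work."""
    start = time.time()
    time.sleep(i * unit)
    return {"thread_id": threading.get_ident(), "start": start,
            "end": time.time(), "input": i, "output": i ** 2}


delays = [3, 1, 2, 1, 2]                 # hypothetical task list, 9 units in total

t0 = time.time()
results = [task(i) for i in delays]      # plain loop: one task after another
elapsed = time.time() - t0

for row in results:
    print(row)
print("threads used:", len({row["thread_id"] for row in results}))
print("elapsed:", round(elapsed, 3))
```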


This is a very basic yet powerful script to show how to use concurrent.futures. First, we create an executor; here we use ThreadPoolExecutor, but you can also use ProcessPoolExecutor if you want. The rule of thumb is to use multithreading when the work is I/O bound and multiprocessing when it is CPU bound.

Here, we set the number of workers (threads) to 1 just to get the syntax working first. We call a method of the executor named map, which does all the heavy lifting: it takes the task list and IMMEDIATELY submits every task to the executor.

As you can tell, the code works: it does exactly what we did above, and the execution plan looks virtually the same as the baseline.

Now let’s change the number of workers to 2. This is where we start seeing parallel processing in action. The moment one worker takes on the 3-second task, a different worker takes on a 1-second task. After that task finishes, the worker picks up another one, and so on and so forth.
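The effect of max_workers can be sketched as follows. The delay list is again hypothetical, each unit is scaled down to 0.05 seconds, and run is a helper name made up for this example; ThreadPoolExecutor and its map method are the real standard library API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def task(i, unit=0.05):
    """Sleep for i * unit seconds to mimic slow I/O, then return the square."""
    time.sleep(i * unit)
    return i ** 2


def run(delays, max_workers):
    """Run every task through executor.map and time the whole batch."""
    t0 = time.time()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(task, delays))   # results keep the input order
    return results, time.time() - t0


delays = [3, 1, 2, 1, 2]                 # hypothetical task list, 9 units in total
for workers in (1, 2, 5):
    results, elapsed = run(delays, workers)
    print(workers, results, round(elapsed, 2))
```

With one worker the batch takes about as long as the sequential loop; adding workers shortens the wall-clock time while the results stay in the same order.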

The total execution time is now only 5 seconds, roughly half of the original 9 seconds. An interesting observation is that map simply hands out tasks in list order; it does not search for the best schedule. In theory, a smarter assignment (for example, keeping the 3-second and 2-second tasks on separate workers) can sometimes shorten the total time, although with 9 seconds of work and two workers the total can never drop below 4.5 seconds. But that is another problem.

You can read more about the map function either from the documentation or the source code directly.

Of course, to see the full power of concurrent.futures, provide enough workers: the total execution time then drops to 3 seconds, the length of the longest task.


So what is “submit”? IMHO, you will use submit more often than map in real life, so this is probably the function you should familiarize yourself with more. Looking into the source code of map, it actually calls submit behind the scenes.

It is very simple to use. In addition, you can pass *args and **kwargs, which gives you more flexibility.

The key difference between map and submit is that map returns results in the order of the inputs, while submit + as_completed loops through the futures as they complete.
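The ordering difference can be seen in a small sketch. The task function and its unit keyword are assumptions for illustration; submit and as_completed are the real concurrent.futures API.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def task(i, unit=0.05):
    """Sleep for i * unit seconds, then hand back the input."""
    time.sleep(i * unit)
    return i


delays = [3, 1, 2]
with ThreadPoolExecutor(max_workers=3) as executor:
    # map: results come back in the order of the inputs
    mapped = list(executor.map(task, delays))

    # submit: one future per call; extra positional and keyword arguments pass through
    futures = [executor.submit(task, i, unit=0.05) for i in delays]
    # as_completed: futures are yielded as they finish, shortest task first here
    completed = [future.result() for future in as_completed(futures)]

print(mapped)
print(completed)
```

Because every task gets its own worker here, the shorter sleeps finish first, so the second list comes back in completion order rather than input order.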



Speaking of the number of workers, the obvious answer is the more, the merrier. However, the computing resources of any given machine are limited: the number of CPUs and the number of threads available.

In this case, I have created 500 tasks and requested 500 workers and they seem to work fine.

In ThreadPoolExecutor, if you don’t specify max_workers, a default value is applied, which you can read more about in the documentation.

For example, take the latest Python 3.8, as indicated in the documentation: os.cpu_count() returns 12 on my DELL XPS, so the default max_workers is min(32, 12 + 4) = 16 workers. From what I have read, the number of physical (hardware) threads is limited, while the number of software threads can be virtually as large as you want; however, once you approach the limit of the physical threads, there is little marginal benefit to increasing the number of threads / max_workers.
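The arithmetic can be reproduced with a short sketch. default_workers is a hypothetical helper that encodes the min(32, cpu_count + 4) rule quoted above for Python 3.8; other Python versions may compute the default differently. The last lines read _max_workers, a private CPython attribute, purely for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def default_workers(cpu_count):
    """The rule documented for Python 3.8: min(32, cpu_count + 4)."""
    return min(32, (cpu_count or 1) + 4)


print(default_workers(12))                   # 16, the DELL XPS case from the text
print(default_workers(os.cpu_count()))       # the value for this machine

# _max_workers is a private attribute (an implementation detail of CPython),
# so treat this as a peek for curiosity rather than something to rely on.
with ThreadPoolExecutor() as executor:
    print(executor._max_workers)
```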


In this blog post, we have covered several basic usages of concurrent.futures.ThreadPoolExecutor, and you should be ready to save some execution time by leveraging all those beautiful threads. However, there is more to it than simply distributing lots of work.

Some key topics worth pursuing in the future:

  1. how to collect all the results and persist them in a thread-safe way
  2. when to use multithreading and when to use multiprocessing, and how to tell
  3. how to configure timeouts and handle errors properly
  4. other aspects of the concurrent.futures library

fund manager efficiency

efficiency = skill * breadth * implementation

IR = IC * sqrt(N) * TC

There is this great paper talking about a key measurement named information ratio – which is the excess return for a unit of active tracking error [JPMorgan]. Keep in mind that it is different from the Sharpe ratio as the information ratio is related to the excess return and risk with respect to an existing benchmark, so instead of the absolute volatility, it will be the tracking error.

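At a very high level, IC is the information coefficient, a number between -1 and 1 that indicates how closely what a manager expects matches what actually turns out to happen. The information ratio also grows with the square root of N, the number of independent decisions (breadth). The last item, the transfer coefficient TC, reflects the level of constraint, i.e. how much of the manager’s insight can actually be implemented in the portfolio.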
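To make the formula IR = IC * sqrt(N) * TC concrete, here is a minimal sketch. The input values (an IC of 0.05, 100 independent decisions, a TC of 0.6) are made-up illustrations, not figures from the paper.

```python
import math


def information_ratio(ic, breadth, tc=1.0):
    """IR = IC * sqrt(N) * TC, with breadth playing the role of N."""
    return ic * math.sqrt(breadth) * tc


# Hypothetical manager: modest skill, 100 independent decisions per year.
ir_unconstrained = information_ratio(ic=0.05, breadth=100)
# Same manager, but constraints let only 60% of the insight through.
ir_constrained = information_ratio(ic=0.05, breadth=100, tc=0.6)

print(ir_unconstrained, ir_constrained)
```

Because breadth enters through a square root, quadrupling the number of independent decisions only doubles the information ratio.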

If you are particularly interested in the information coefficient on its own, there is a video from Quantopian that teaches how it is measured from a practical point of view.

Shift, Twist & Curvature to explain all yield curve changes

A yield curve change refers to how, as time passes, the yields change for the different maturities. Rather than calculating the difference for every single maturity, the industry has always been seeking a way to simplify this and find a simple explanation. The three key components behind any yield curve change are the famous shift, twist and curvature.

Just like you can use x and y to explain all the positions in a 2D space, you can use shift, twist and curvature to explain all the yield curve changes. These three components are also constructed as uncorrelated factors.

There is a great YouTube video that further explains how the three components are calculated using Principal Component Analysis.

Yield Forecasting and Principal Component Analysis

Once we understand the three components of change, the next step is usually to prepare actions in anticipation of changes in each component. For example, you might have three different options for structuring your portfolio: laddered, barbell or bullet.

Then you can list all the possible combinations of the three factors and calculate the potential outcome of each portfolio structure under each combination.

You can weight the scenarios differently based on your own forecast of the yield curve, and use the probability-weighted average outcome as the expectation for picking the best action.
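The probability weighting can be sketched in a few lines. Everything numeric below is hypothetical: the three scenarios, their probabilities and the returns (in percent) of each portfolio structure are invented for illustration and are not taken from the CFA exhibits.

```python
# Hypothetical forecast: probability of each yield curve scenario.
scenario_probs = {
    "parallel shift down": 0.3,
    "steepening twist": 0.5,
    "more curvature": 0.2,
}

# Hypothetical return (in percent) of each structure under each scenario.
portfolio_returns = {
    "laddered": {"parallel shift down": 2.0, "steepening twist": 0.5, "more curvature": 0.2},
    "barbell": {"parallel shift down": 2.5, "steepening twist": -0.5, "more curvature": -0.4},
    "bullet": {"parallel shift down": 1.5, "steepening twist": 1.0, "more curvature": 0.6},
}


def expected_return(returns_by_scenario, probabilities):
    """Probability-weighted average return across the scenarios."""
    return sum(probabilities[s] * returns_by_scenario[s] for s in probabilities)


expectations = {
    name: expected_return(returns, scenario_probs)
    for name, returns in portfolio_returns.items()
}
best_structure = max(expectations, key=expectations.get)

for name, value in expectations.items():
    print(f"{name}: {value:.2f}%")
print("best:", best_structure)
```

With a different forecast (different probabilities), the ranking can flip, which is the whole point of weighting by your own view.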

Exhibit 76 and Exhibit 77 are screenshots from the CFA Curriculum, and the copyright is reserved to them.

Riding/Rolling Down the Yield Curve

A yield curve is a graph that shows the relationship between yield (say, of government bonds) and maturity. It is just like purchasing a CD, where the bank offers a 1% interest rate for 1 year and 3% for 10 years. Most of the time, a yield curve is upward sloping.

The fun part is that the yield curve is only a snapshot. As time passes, the government might issue new bonds with a different interest rate/yield for the same maturity, and this change can happen to short term, medium term or even long term bonds, so the next snapshot might look very different. When the shape changes, the bond that you are holding will still have the same future nominal cashflows, but its market value changes immediately: you have a capital gain when the yield drops and suffer a loss when the yield rises.

Assuming that the shape of the yield curve stays fairly static, there is a fixed income technique called rolling down (or riding) the yield curve to enhance the return. If you simply hold a bond bought at par to maturity, the return is only as high as the coupon rate, but you can achieve an additional capital gain if you buy a long term bond and sell it later, once it has become a shorter term bond.

With an upward sloping curve, the market yield for shorter term bonds tends to be lower. As your long term bond ages, its remaining cashflows get discounted at that lower yield, and discounting with a lower rate makes the valuation of your bond higher.

Here I listed a 10 year bond with a 5% yield that was originally priced at par. Assume that at the end of every year we consider selling it, and that the yield decreases proportionally with the remaining maturity. For example, if you sell at the end of year one, your bond has become a 9 year bond and will be valued at a yield of 4.5% rather than 5%, and so on and so forth. As you can tell, the present value of the bond varies along the yield curve, and there happens to be a peak point that maximizes the capital gain.

In this particular example, the best decision is actually to sell the bond at the end of year 1 to maximize the annual gain.
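The example can be reproduced with a short sketch. It assumes annual coupons, a 5% coupon on a face value of 100, a yield curve that stays fixed at 0.5% per year of remaining maturity, and it measures the "annual gain" as the simple average capital gain per year held. These are my assumptions for illustration and may differ from the spreadsheet behind the charts.

```python
FACE = 100.0
COUPON_RATE = 0.05
MATURITY = 10


def bond_price(face, coupon_rate, yield_rate, years_left):
    """Present value of the remaining annual coupons plus the face value."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + yield_rate) ** t for t in range(1, years_left + 1))
    pv_face = face / (1 + yield_rate) ** years_left
    return pv_coupons + pv_face


def yield_for(years_left):
    # Assumed static curve: 5% at 10 years, 4.5% at 9 years, and so on.
    return 0.05 * years_left / MATURITY


sale_prices = {}
annual_gain = {}
for year_sold in range(1, MATURITY + 1):
    years_left = MATURITY - year_sold
    price = bond_price(FACE, COUPON_RATE, yield_for(years_left), years_left)
    sale_prices[year_sold] = price
    # Simple average capital gain per year held (coupons are the same every year).
    annual_gain[year_sold] = (price - FACE) / year_sold

peak_price_year = max(sale_prices, key=sale_prices.get)
best_year = max(annual_gain, key=annual_gain.get)

for year in sale_prices:
    print(f"sell at end of year {year}: price {sale_prices[year]:.2f}, "
          f"gain per year {annual_gain[year]:.2f}")
```

Under these assumptions the sale price peaks in the middle of the bond's life, while the gain per year held is highest when selling after the first year.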

Let’s take a look at another yield curve shape. If we change the yield curve to be concave, we find that the annual income is maximized if we sell at year 6.

If the yield curve is convex, this is what it looks like.

Core Capital Estimation with Mortality Tables


Core capital is defined as “the amount of capital required to fund spending to maintain a given lifestyle, fund goals, and provide adequate reserves for unexpected commitments”. In other words, it is the money “needed” for the rest of one’s life.

The concept of core capital is pretty important. In estate planning, one needs to clearly understand how much capital there is to pass on to others (family, charity, etc.). Just like financial reporting for a company, it is not too difficult to come up with something similar for a family by listing the total assets and liabilities. If we factor in the total liabilities including future spending (the core capital), and also the human capital, i.e. all projected future employment income, we will be able to find the “excess capital”. That is why accurately quantifying the core capital is an important element of estate planning.

In order to calculate that, we need to know people’s life expectancy and the spending for each year, and take the time value of money into account to calculate the total present value.

Mortality Table

People’s life expectancy is surprisingly predictable at an aggregated level. For a group of kids, it is very likely that all of them will survive another 10 years, and it is almost as certain that nearly none of a group of 65 year olds will survive another 50 years.

There is a table, called a mortality table, that lists many of the statistics related to survival at different ages. With a quick Google search, one can easily find several mortality tables published by the US CDC (Centers for Disease Control and Prevention).

The document that I found is split into different tables based on demographics (Hispanic, White and Black), each with corresponding male and female tables.

Head of the mortality table
Tail of the Mortality Table

Most of the statistics are fairly easy to understand, and you can refer to this report, the National Vital Statistics Report, for the detailed definition and explanation of each field (quite a fun read!).

Of course, if we need to calculate the core capital required for a family (say, a couple), we need to use the joint survival probability, i.e. the probability that at least one of the two is still alive, calculated as one minus the probability that both have died.


Given the probability of surviving each year, we need to estimate the spending needs for each year, weight them by that probability and discount them. For the discount rate, there is the risk free rate which we can use. However, we also want to take inflation into consideration, so our discount rate is likely going to be the real risk free rate, roughly the nominal risk free rate minus the inflation rate.
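Putting the pieces together, a minimal sketch of the calculation could look like the following. The function names, the three-year horizon, the survival probabilities and the 2% real rate are all hypothetical placeholders (real inputs would come from a mortality table), and the joint probability assumes the two lives are independent.

```python
def joint_survival(p_first, p_second):
    """Probability that at least one of two people is alive.

    Assumes independence: one minus the probability that both have died.
    """
    return 1.0 - (1.0 - p_first) * (1.0 - p_second)


def core_capital(annual_spending, survival_probs, real_rate):
    """Present value of spending, weighted by the probability of being alive."""
    total = 0.0
    for year, (spending, p_alive) in enumerate(
        zip(annual_spending, survival_probs), start=1
    ):
        total += p_alive * spending / (1.0 + real_rate) ** year
    return total


# Hypothetical inputs: cumulative survival probabilities for the next 3 years.
p_husband = [0.98, 0.95, 0.92]
p_wife = [0.99, 0.97, 0.95]
p_joint = [joint_survival(h, w) for h, w in zip(p_husband, p_wife)]

spending = [100_000, 100_000, 100_000]  # in today's dollars
real_rate = 0.04 - 0.02  # nominal risk free rate minus inflation (approximation)

print(f"core capital for 3 years: {core_capital(spending, p_joint, real_rate):,.0f}")
```

A real calculation would run the loop over the whole table, out to the age where the survival probability is effectively zero.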

Double Taxation


Countries impose taxes to maintain their operations, and different countries have different tax systems. When cross-border business happens, it becomes more interesting. An income usually involves two “where”s: one is where the income was originally sourced, aka where the money was made, and the other is where this successful business owner resides.

A country that taxes all income earned within its borders is said to have a source (territorial) tax system. On the other hand, a country can tax based on the business owner’s residency: for example, it might not tax aliens, but at the same time it will impose taxes on its own citizens’ foreign income. That is a residence tax system.

If you are unlucky, you might have to pay tax twice: maybe because your business happens to qualify as sourced from both countries, maybe because of your dual citizenship, or because your income got taxed where it was sourced in one country and then a different country, where you reside, imposes another income tax afterwards. That is called double taxation.

Usually countries work together to mitigate this effect. The country where the money was sourced usually takes the first cut, and then the residence country provides some relief in its tax code to accommodate. In this case, the source country will tax at its own fixed rate, say 40%. How much extra the resident should pay to the residence country is calculated in one of three ways: credit, exemption or deduction.

Foreign Tax Credit Provisions

The exemption method is the most straightforward: if the resident has paid income tax to the source country, then there won’t be any further taxation.

Under the credit method, one pays the residence country only the amount by which the residence tax exceeds the source tax. If the source rate is 50% and the residence rate is only 20%, there is no need to pay more tax after you got taxed 50%. However, if the source rate is 20% while the residence rate is 50%, one still needs to pay the delta of 30% to the residence country. One way or the other, the total tax rate ends up being max(T_source, T_residence).

Under the deduction method, the source tax is treated just like the various types of individual income deductions, such as medical expenses or car tax. The residence country will only tax the portion left after the source tax, so its tax will be T_residence * (1 – T_source). The total tax rate will be T_source + T_residence * (1 – T_source) = T_source + T_residence – T_residence * T_source.

So let’s take a look at an example, if we hold T_source to be a flat 40%.

Under the credit method, the total tax rate will be max(T_residence, 40%), and under the deduction method, the total tax rate will be 40% + T_residence * 60%.
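The three formulas are easy to compare side by side. The sketch below holds T_source at 40% and sweeps T_residence from 0% to 100% in 5% steps; the function names and the grid are my own choices for illustration.

```python
def total_tax_exemption(t_source, t_residence):
    """Exemption: only the source country taxes the income."""
    return t_source


def total_tax_credit(t_source, t_residence):
    """Credit: the residence country only collects the shortfall, if any."""
    return max(t_source, t_residence)


def total_tax_deduction(t_source, t_residence):
    """Deduction: the residence country taxes what is left after source tax."""
    return t_source + t_residence * (1 - t_source)


T_SOURCE = 0.40
residence_rates = [pct / 100 for pct in range(0, 101, 5)]

# How much more the taxpayer pays under deduction than under credit.
gap = {
    t_res: total_tax_deduction(T_SOURCE, t_res) - total_tax_credit(T_SOURCE, t_res)
    for t_res in residence_rates
}
widest_gap_rate = max(gap, key=gap.get)

for t_res in residence_rates:
    credit = total_tax_credit(T_SOURCE, t_res)
    deduction = total_tax_deduction(T_SOURCE, t_res)
    print(f"T_residence={t_res:.0%}: credit={credit:.0%}, deduction={deduction:.0%}")
print(f"widest gap at T_residence={widest_gap_rate:.0%}")
```

On this grid the credit total never exceeds the deduction total, and the gap between the two is widest where the residence rate equals the 40% source rate.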

Example – Credit better than Deduction

One interesting observation is that, for the same source tax rate, the credit method is always at least as favorable to the taxpayer as the deduction method.

And if we plot the ratio of credit vs the deduction method.

We can see that the total tax under the credit method is never higher for the taxpayer, and the benefit is the largest when the residence tax rate equals the source tax rate. That being said, when the residence tax rate is extremely high or extremely low, both methods converge to the same total (T_residence at the high end, T_source at the low end), which is not a surprise after all.

Tax Loss Harvesting

(All content here is for academic and self-entertainment purposes. Check the facts with an authority before you believe any of it; the author claims no responsibility for misuse or misinterpretation.)

One can make money or lose money by investing in stocks. Stock prices usually go up and down, and people usually lose money by selling their stocks out of panic during a market downturn, which also means missing out on the following upturn (aka mean reversion). There is a saying, “it is not a loss until you sell”, which urges people not to sell in scenarios like this.

However, there is a technique in investing called tax loss harvesting. This technique takes tax into consideration and demonstrates that, under certain (or even most) scenarios, it is a better decision to realize the loss and reinvest the tax saved, achieving superior performance compared to “being passive”. This post goes through an example, explains how tax loss harvesting works and highlights some of the underlying assumptions.

Say I have a portfolio which is a mixture of company stocks with a total value of $100K at the beginning of 2020. By the end of 2020, I have realized a $10K gain by trading. Of course, some stocks went up and some went down; assume that some of the stocks I still hold have dropped in value by $10K or even more. And the tax rate is 20%.

The question is what difference will it make if I sell those “losers” and replace with more promising ones.

The baseline scenario is that we don’t sell those losers. We will need to pay 20% on the realized gains, which is $10K * 20% = $2K in taxes.

If we sell those losers with a realized loss of $10K, that completely cancels out the gain so far and the net gain will be 0. No tax for you this year; in other words, the tax saving from the baseline is $2K.

Free money?! No.

As one can tell, just by realizing the loss, you immediately eliminate your tax this year, but the taxes are NOT gone. You are postponing them by reducing the cost basis of the replacements you bought for those losers. For example, suppose those $10K losses came from a $50K initial investment, and after realizing the losses you reinvested the remaining $40K in new stocks. If the price goes up in the future and you harvest the gain then, you will need to use $40K as the cost basis, rather than the $50K you would have used had you not sold. So basically you still have to pay the tax on the “losses” then. Hence, tax loss harvesting is by nature a type of tax deferral.

Let’s go through an example. Suppose those securities, or the replacement securities, rise in value to $100K, and we will sell them one way or another by the end of next year. In the baseline scenario, where we don’t realize the loss at the end of year 1, our gain at that time will be ($100K – $50K) = $50K, and we will need to pay 20% of that gain as taxes, which equates to $10K. If we realized the loss, our gain will instead be ($100K – $40K) = $60K, and our taxes next year will be $12K.

Tax      Baseline   Realize Loss
Year 1   $2K        $0
Year 2   $10K       $12K
Total    $12K       $12K

As you can tell, no difference. Let’s parameterize it.

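Tax      Baseline     Realize Loss
Year 1   T1           T1 – L * T
Year 2   G * T        (L + G) * T
Total    T1 + G * T   T1 – L * T + (L + G) * T = T1 + G * T

Here T1 is the year 1 tax in the baseline, L is the realized loss, G is the year 2 gain in the baseline and T is the tax rate.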

As you can see, the realized loss term is cancelled out in the future by the time you sell anyway, so there are only two scenarios in which this “extra work” is worth doing: a different, lower tax rate in the future, or reinvesting the deferred tax.
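The cancellation can be checked numerically with the figures from the example: a year 1 baseline tax of $2K, a $10K realized loss, a $50K baseline gain in year 2 and a 20% tax rate. The function names are made up for illustration, and the same tax rate is assumed in both years.

```python
def taxes_baseline(t1, gain, rate):
    """Return (year 1 tax, year 2 tax) when the losers are kept."""
    return t1, gain * rate


def taxes_with_harvest(t1, loss, gain, rate):
    """Return (year 1 tax, year 2 tax) when the loss is realized in year 1.

    The realized loss lowers this year's tax by loss * rate, but it also
    lowers the cost basis, so next year's taxable gain grows by the same loss.
    """
    return t1 - loss * rate, (loss + gain) * rate


T1, LOSS, GAIN, RATE = 2_000.0, 10_000.0, 50_000.0, 0.20

baseline = taxes_baseline(T1, GAIN, RATE)
harvest = taxes_with_harvest(T1, LOSS, GAIN, RATE)

print("baseline:", baseline, "total", sum(baseline))
print("harvest: ", harvest, "total", sum(harvest))
```

The totals match; only the timing of the payments differs, which is why the benefit has to come from what you do with the deferred amount or from a lower rate later.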

If you understand the time value of money, a dollar today is not the same as a dollar tomorrow, or a dollar next year. Deferring any type of cash outflow into the future reduces the net present value of that outflow. At a minimum, the benefit is the inflation rate, but in the investment case, if there are better ways of putting that cash to work (buying new promising stocks), the difference can be substantial.

So basically, in this very example, you have $2K at your disposal to invest, and any return on that $2K is a net gain. More generally, you have L * T of money that you can use. The higher the amount, the better, if and only if you can realize extra gains from it.

In the grand scheme, this whole tax loss harvesting might not be substantial. Why? The tax alpha (abnormal return) is basically how much money you can gain from the realized loss, the tax rate and the return on the reinvested tax saving: L * T * r.

For example, if the loss rate is r_l and the return on the reinvestment is r_g, the tax alpha as a rate is r_l * T * r_g (taking the loss as a magnitude). This can only be substantial under the premise that your loss was somehow huge and then all of a sudden you have a promising investment. On regular days this rarely happens, but in big market downturns this return can be substantial. Say r_l is -50% and r_g is +50%: the product of the two magnitudes is 25%, and with a tax rate of 40% you get a 10% return just out of tax loss harvesting.
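As a quick check of the arithmetic, here is a minimal sketch of both forms of the formula. The function names are hypothetical, and the loss rate is taken as a magnitude so that a loss of -50% enters as 0.5.

```python
def tax_alpha_dollars(loss, tax_rate, reinvest_return):
    """Extra money earned by reinvesting the deferred tax: L * T * r."""
    return loss * tax_rate * reinvest_return


def tax_alpha_rate(loss_rate, tax_rate, reinvest_return):
    """The same idea as a rate on the original investment: |r_l| * T * r_g."""
    return abs(loss_rate) * tax_rate * reinvest_return


# The downturn example: lose 50%, tax rate 40%, reinvestment returns 50%.
print(tax_alpha_rate(loss_rate=-0.50, tax_rate=0.40, reinvest_return=0.50))

# The earlier $10K loss at a 20% tax rate frees up $2K; a hypothetical
# 50% return on that money would be worth $1K.
print(tax_alpha_dollars(loss=10_000, tax_rate=0.20, reinvest_return=0.50))
```

With a small loss or a mediocre reinvestment return, the product of three fractions shrinks quickly, which is why the effect is usually modest outside of big downturns.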

Also, if you do this on a consistent basis, little things add up, and you also gain a compounding effect in wealth accumulation by deferring tax payments consistently. How long the tax is deferred is up to how soon you sell: if you harvest this year, reinvest and sell far, far in the future, you have theoretically deferred it for a very long time.

In the end, to summarize, tax loss harvesting is one way to take advantage of the two-sided tax regimes in most countries: passively allow gains to grow unharvested, but actively realize losses. Again, this is by no means meant to encourage pursuing losses, nor to give any level of comfort that a loss is good. It is just another way to get some value out of a shitty situation. Just like you would not intentionally make your food go bad, but if it did, why not make banana bread out of it.