Stock Price History – Kaggle Dataset into SQLite

Having hit the dead end of paying for an API to query all the companies, I decided to try my luck. Somewhere on the internet there must be a site with the beautiful CSV file I have been looking for. Don’t give up!

This post is a quick write-up of how I found a public stock price dataset on Kaggle and, most importantly, how to inspect the data and load it into a database in a clean format for later research.

1. Download

Frankly speaking, there are plenty of places where you could scrape the data if you approach it carefully. There are also datasets on Quandl / Quantopian, but you still have to be a premium user to use them. After some research, however, it turned out that Kaggle, the community where data analysts and developers bang their heads against difficult machine learning problems, has the solution for me.


They have a datasets repository where some really cool data has been published. After a quick search, you can find several datasets related to equity prices, and some even include the financial performance of those companies (the fundamentals) that we can play with later. For now, our focus will be the “Huge Stock Market Dataset”.


2. Extraction

The data has a decent size, and I will kindly warn Windows users who use the default compression/decompression program: it will be slow. I have a pretty old HP desktop and it was decompressing the file at about 1 MB/s, which would have taken ages. I highly recommend 7-Zip, a free archive application that handles the commonly used compression formats. For me, it was about 5 times faster.


3. Format

First, let’s take a quick look at the dataset. Uncompressed, it is about 770 MB spread across roughly 8,500 files. It is organized into two folders, the ETF and the Stock:


The data is structured so that each symbol/ticker is an individual text file, all following the same naming format.

Let’s take a look at Apple’s data file to understand the file structure.


It looks like a pretty classic CSV (comma-separated values) file containing the daily prices since 1984-09-07. It does go back a long time, but Apple had its IPO in December 1980, so this dataset does not contain the full history. Another quick check is whether the stock price has been adjusted, meaning that whenever there is a stock split or merge, the price is rebased or normalized for analysis purposes. If not, our analysis might reach the conclusion that the stock price dropped by 50% when in fact it was merely a 2-for-1 split.

By visiting Apple’s website, you can see they have split the stock 4 times: once 7-for-1 and the other three times 2-for-1.


So theoretically, one share at IPO is now equivalent to 1 * 2 * 2 * 2 * 7 = 56 shares today. I came across a blog post from Maria Langer, and the story she shared of how her stocks grew since 1997 is totally interesting and inspiring. In the end, I did find a picture of a 1998 Apple stock certificate to show you how expensive those shares could be today if there had been no stock splits.


This certificate was issued on Apr 30, 1998, and there have been three splits (2*2*7=28) since then. At the market close this Friday, each share was ~$165. So if there had never been a stock split, you would need a lump sum of $4,620 (165 * 28) just to buy one Apple share. That would totally change the demographics of Apple’s investors: probably only high-net-worth individuals or institutions would be able to invest, there would be much less liquidity, and Apple probably wouldn’t be as successful a household name as it is today.

Anyhow, Yahoo Finance adjusts its pricing data to take stock splits into consideration.


Apple IPOed at $22 per share. In Yahoo Finance, the Dec 1980 price was $0.51, which roughly lines up with the stock splits ($0.51 * 56 ≈ $28.6, the same ballpark as $22). People might say “had I invested $XXX, I would have $YYY today”. The short answer is that even if you were an investor back in the 1980s, it was actually very difficult to see that a company like Apple would be a good one to invest in.

All that hyper growth looks exciting, but let’s compare it with the interest rate. For example, the Fed interest rate in 1980 was 17.26%. At the time this blog was written, the Fed rate was only between 2% and 3%. With the risk-free rate that high, I really can’t imagine how anyone would take the risk and invest their savings into a tech startup whose CEO dressed like a college student.

To prove my point, you can pull the Fed rate: the risk-free holding period return is 523% if you had bought T-bills instead.

That is a mouthful and enough of a distraction. Let’s get back to checking whether our dataset actually contains the adjusted price. The starting price in 1984 is 42 cents, far less than the $22 IPO price. That is a good indicator that the downloaded data is adjusted.

4. ETL – Database

Even though the data is already in text format on your disk, my personal preference is to convert it into something easier to deal with, such as a database. For now, let’s dump it into SQLite. Then it will be pretty easy to run some analytics or connect it to other tools like Python and visualization tools.


With Pandas and SQLAlchemy, life is easy. Since this conversion requires a lot of disk reads and writes, it took me a while, about half an hour, so it is a good idea to add a progress bar and try/except logic.
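As a rough sketch of that conversion, here is a version that uses only the standard library csv and sqlite3 modules in place of Pandas and SQLAlchemy, so it is not the script I actually ran. The table name prices, the function name load_price_files and the column header (Date, Open, High, Low, Close, Volume, OpenInt) are assumptions; check them against your own copy of the files.

```python
import csv
import sqlite3
from pathlib import Path

# Assumed header of each price file; adjust if your copy differs.
COLUMNS = ["Date", "Open", "High", "Low", "Close", "Volume", "OpenInt"]


def load_price_files(data_dir, db_path, report_every=500):
    """Load every *.txt file under data_dir into one SQLite table.

    Returns the tickers whose files were empty or unreadable.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (ticker TEXT, date TEXT, "
        "open REAL, high REAL, low REAL, close REAL, "
        "volume INTEGER, openint INTEGER)"
    )
    failed = []
    files = sorted(Path(data_dir).rglob("*.txt"))
    for done, path in enumerate(files, start=1):
        # Assumption: the ticker is the part of the file name before the first dot.
        ticker = path.name.split(".")[0]
        try:
            with open(path, newline="") as handle:
                rows = [
                    [ticker] + [record[col] for col in COLUMNS]
                    for record in csv.DictReader(handle)
                ]
            if not rows:
                raise ValueError("no data rows")
            conn.executemany(
                "INSERT INTO prices VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
            )
            conn.commit()
        except (OSError, KeyError, ValueError):
            # Keep going and remember which ticker failed.
            failed.append(ticker)
        if done % report_every == 0:
            # Poor man's progress bar.
            print(f"{done}/{len(files)} files processed")
    conn.close()
    return failed
```

The returned list is what lets you see afterwards which tickers had nothing to load, instead of the whole job dying on the first bad file.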

In the end, 32 companies somehow had empty txt files:

['accp', 'amrh', 'amrhw', 'asns', 'bbrx', 'bolt', 'boxl', 'bxg', 'ehr', 'fmax', 'gnst', 'hayu', 'jt', 'mapi', 'molc', 'otg', 'pbio', 'pxus', 'rbio', 'sail', 'sbt', 'scci', 'scph', 'send', 'sfix', 'srva', 'stnl', 'vist', 'vmet', 'wnfm', 'wspt', 'znwaa']


I took a quick look at Yahoo Finance and they do look like legitimate companies, some with a good history of data, but I guess we will put a pin in the question of why their data is missing and focus on the ones that we have.


After all of this, we have 17 million records for 8,507 different public companies (a count distinct took 45 seconds without indexing, so be cautious when you play with complex queries), and the database is about 1.3 GB.
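One way to make that count distinct cheaper is to index the ticker column first. A minimal sketch, assuming a table called prices with a ticker column (both names are hypothetical, use whatever your load step created); I have not timed it against the full 17 million rows, so treat the speedup as something to verify.

```python
import sqlite3


def count_tickers(conn):
    """Count distinct tickers, creating an index on the column first."""
    # With the index in place SQLite can walk the (much smaller) index
    # instead of scanning every row of the table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_prices_ticker ON prices (ticker)")
    return conn.execute("SELECT COUNT(DISTINCT ticker) FROM prices").fetchone()[0]
```

Building the index takes a while and adds to the file size, so it only pays off if you query by ticker repeatedly.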

In the next post, we will do some descriptive analytics and hopefully figure out an efficient way of manipulating the data.







Stock Price for Nasdaq Listed Companies – Alpha Vantage – “Free”mium

If you are interested in playing with time series data, finance is usually a good place to start, and stock prices on an exchange are probably the most frequently used example. There are many exchanges out there and NASDAQ is a good one. I shopped around on the Internet, but it is a bit hard to find a good dataset with fine-grained data without paying. There are plenty of free APIs out there, but they are all based on tickers, so we can put together a solution where we first get a list of public companies and then loop through them, making one API call per company and specifying the time range.

1. Get Company List

By visiting the Nasdaq website, you can easily find a download file which contains all the tickers listed there (not only Nasdaq but also Amex and NYSE): nasdaq_company_list

And this is what the data file looks like.


This is the first time I have ever seen 3,435 public companies listed in such a clean format, so let’s do some quick analysis. Industry is a subcategory of sector and has tens, if not hundreds, of different categories, which might be difficult to display. For now, let’s aggregate by sector and look at the total market cap, the number of companies, and maybe how “young” each sector is by calculating the median IPO year.


As you can see, the technology sector has the largest market cap ($5.9 trillion), which is almost half (46%) of the whole market; the total market cap of the whole list is about $13 trillion. At the same time, it is interesting to find that it is actually the finance sector that has the highest (most recent) average IPO year and, not surprisingly, consumer durables has the lowest (oldest). From the company count perspective, health care has the most public companies.

Anyway, now that we have a trustworthy list of tickers, the next step is to hit the API and get the time series stock prices for those companies via Alpha Vantage.
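The aggregation by sector can be sketched in a few lines of plain Python. This is only an illustration: the column names Sector, MarketCap and IPOyear are my assumption about the downloaded file, and MarketCap is assumed to be a plain number, so adjust the parsing to whatever your copy contains.

```python
from collections import defaultdict
from statistics import median


def summarize_by_sector(rows):
    """Company count, total market cap and median IPO year per sector.

    rows is an iterable of dicts, for example from csv.DictReader.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["Sector"]].append(row)

    summary = {}
    for sector, members in groups.items():
        # Skip missing IPO years (anything that is not all digits).
        ipo_years = [
            int(m["IPOyear"]) for m in members if m["IPOyear"].strip().isdigit()
        ]
        summary[sector] = {
            "companies": len(members),
            "market_cap": sum(float(m["MarketCap"]) for m in members),
            "median_ipo_year": median(ipo_years) if ipo_years else None,
        }
    return summary
```

Feeding it the rows of the company list gives one small dictionary per sector, which is easy to sort by market cap or by median IPO year.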

2. Get Time Series data

I put together this little program so that I can make the calls and store the raw responses on my local disk for later processing. SQLite is a good option, and you can use DB Browser for SQLite to view the table content easily.


One small tip: using the question mark placeholder in the insert statement above is an easy way to escape all the characters. It is quite neat because you don’t have to play with double quotes and single quotes, which is a big pain in the ass.
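To show the placeholder style on its own, here is a minimal sketch with Python’s sqlite3 module. The table raw_response, its columns and the function name store_response are hypothetical names for illustration.

```python
import sqlite3


def store_response(conn, symbol, body):
    """Store one raw API response using question mark placeholders."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_response (symbol TEXT, body TEXT)"
    )
    # The values are passed separately from the SQL text, so quotes inside
    # body never need to be escaped by hand.
    conn.execute(
        "INSERT INTO raw_response (symbol, body) VALUES (?, ?)",
        (symbol, body),
    )
    conn.commit()
```

Because the driver binds the values itself, a response full of double and single quotes goes in and comes back out unchanged.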

Unfortunately, my job couldn’t finish even for my test against just the first 10 companies, and I should have guessed it way ahead of time: it is a “free-mium” service. The API has an extremely small limit, and if you try to make more than 5 calls a minute, you need to upgrade to the premium service, which I am not ready to do yet. I guess this is the end of the post: a perfect example of how difficult and time-consuming it can be to hunt down good data sources.
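If you are patient, one way to live within a 5 calls a minute limit is simply to space the calls out. The helper below is a sketch I did not run against the real service; throttled is a hypothetical name, and each item in calls stands for whatever function you use to fetch one ticker.

```python
import time


def throttled(calls, max_per_minute=5, sleep=time.sleep, clock=time.monotonic):
    """Run zero-argument callables, starting them at least
    60 / max_per_minute seconds apart, and return their results."""
    min_interval = 60.0 / max_per_minute
    results = []
    last_start = None
    for call in calls:
        if last_start is not None:
            # Wait out whatever is left of the interval since the last call.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(call())
    return results
```

At 5 calls a minute, 3,435 tickers would take 3435 / 5 = 687 minutes, roughly eleven and a half hours, which is slow but workable as an overnight job.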


Frankly speaking, it is not that expensive, but as a hobbyist you probably want to shop around and see if there is a better choice for your weekend project.



Fixed Loan Monthly Payment Calculation

Most people come across various forms of loans in their lifetime; in the USA the most commonly seen are fixed-rate loans for automobiles or houses. There is usually a fixed payment each period, and by the end of the loan all payments, including interest and principal, will have been paid in full. But how is the monthly payment calculated?

If you got a quote from the lender, the quickest way is to use an online mortgage calculator, where pretty much everything is plug and play. To derive the equation on your own, there are a few approaches, and here I am going to cover three of them: using a financial calculator, using the time value of money, and, the hard-core way, deriving the monthly payment from the definition of a loan.

1. Using Texas Instruments BA II Plus financial calculator

It is pretty easy to use the financial calculator to calculate the monthly payment. I uploaded a YouTube video.

You can also quickly check with the calculator in Google to confirm the calculation was right.

Screen Shot 2018-10-28 at 9.41.48 AM.png

2. Time Value of Money

In finance, people like to evaluate the intrinsic value of an asset as the net present value of all its future cashflows. In a 30-year fixed-rate mortgage with equal monthly payments, we can call the monthly payment pmt. The payment at period N should be discounted back to today using the periodic interest rate, which here is the monthly interest rate.

For example

NPV(pmt_12) = pmt / (1+r)^(12) where r = Annual Interest Rate / 12

In this case, the net present value of all future cashflows is the sum of the present values of all the monthly payments, and that sum must equal the amount borrowed.

Screen Shot 2018-10-28 at 4.21.11 PM

3. Monthly payment derivation

Screen Shot 2018-10-28 at 9.34.59 AM


Putting the numbers into the equation, we get the right result.

Screen Shot 2018-10-28 at 4.27.22 PM.png
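For readers who prefer code to a calculator, here is a small sketch of the same idea. It uses the standard closed form pmt = P * r / (1 - (1 + r)^-n), which is what summing the discounted payments as a geometric series and setting the sum equal to the loan amount P gives; the function names are mine.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment on a fully amortizing fixed-rate loan."""
    r = annual_rate / 12   # monthly interest rate
    n = years * 12         # number of monthly payments
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)


def present_value_of_payments(pmt, annual_rate, years):
    """Discount every monthly payment back to today and add them up."""
    r = annual_rate / 12
    n = years * 12
    return sum(pmt / (1 + r) ** k for k in range(1, n + 1))
```

For a hypothetical $200,000 loan at 6% over 30 years this gives about $1,199.10 a month, and discounting those 360 payments back to today recovers the $200,000, which is exactly the time value of money argument from section 2.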

LIFO Reserve

LIFO (Last In, First Out) is a commonly used inventory recognition method, mostly in the United States. It assumes that the latest inventory is sold first from the accounting perspective, not necessarily physically. Since inventory cost usually varies, even for the same physical goods and the same sales, different ways of recognizing the cost can lead to different financial performance on paper. LIFO is said to have a positive impact on cashflow when inventory costs are trending up, because recognizing the more expensive cost reduces profit and hence tax.

This article focuses on a term called the LIFO reserve. It is defined as the difference between the inventory amounts recognized under the two methods, FIFO and LIFO. With the LIFO reserve, inventory value and COGS under one method can easily be converted to the other.

Now let’s see how to adjust some of the numbers when comparing a company that reports under LIFO to ones that don’t.


The inventory amount under LIFO will need to add the LIFO reserve in order to reach the inventory amount under FIFO


COGS under LIFO should subtract the increase in LIFO reserve to reach COGS under FIFO

Now let’s explain why. We add a number to the end of each variable to represent the year: XXX_1 means year one, and so on. Based on the definition:

LIFO_reserve_1 = inventory_FIFO_1 - inventory_LIFO_1   (Equation 1)

LIFO_reserve_2 = inventory_FIFO_2 - inventory_LIFO_2   (Equation 2)

inventory_FIFO_2 = inventory_FIFO_1 + inventory_bought_FIFO - inventory_sold_FIFO   (Equation 3)

inventory_LIFO_2 = inventory_LIFO_1 + inventory_bought_LIFO - inventory_sold_LIFO   (Equation 4)

Under both methods, inventory bought is the same because it is the actual money paid for new inventory. The wiggle room is in inventory sold (that is, COGS), which depends on which inventory you assume was sold.

Subtracting Equation 4 from Equation 3, the inventory bought terms cancel and we have

inventory_FIFO_2 - inventory_LIFO_2 = inventory_FIFO_1 - inventory_LIFO_1 - inventory_sold_FIFO + inventory_sold_LIFO

Rearranging and applying Equations 1 and 2,

inventory_sold_LIFO - inventory_sold_FIFO = (inventory_FIFO_2 - inventory_LIFO_2) - (inventory_FIFO_1 - inventory_LIFO_1) = LIFO_reserve_2 - LIFO_reserve_1 = increase_in_LIFO_reserve

One of the CFA Level 1 practice problems explains well how everything fits together.

Screen Shot 2018-10-20 at 11.26.06 AM

The return on assets under LIFO for 2014 is 178/5570 = 3.2%.

Under FIFO, we need to adjust (increase) not only the net income, since COGS is lower and profit is higher, but also the assets, by adding the LIFO reserve. Both the numerator and the denominator increase, but by different amounts.

Net_Income_FIFO = Net_Income_LIFO + LIFO_reserve_change adjusted by tax

Total_asset_FIFO = Total_asset_LIFO + LIFO_reserve_2014 adjusted by tax

Net_income_FIFO / Total_asset_FIFO = [Net_income_LIFO + (867-547)*(1-t)] / [Total_asset_LIFO + 867*(1-t)] = (178 + (867-547)*(1-33.3%)) / (5570 + 867*(1-33.3%)) = 6.4%.

The return on assets pretty much doubled under FIFO.

(867-547)*(1-33.3%) / (867*(1-33.3%)) = 37%, which is pretty high. So analyzing the LIFO reserve, especially the percentage change in the LIFO reserve and how large the reserve is relative to total assets, will give you a good picture of how return on assets differs under the two methods.
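The same adjustment can be written as a small function so you can plug in other companies. This is a sketch that follows the adjustment exactly as written above (after-tax change in the reserve added to net income, after-tax ending reserve added to total assets); the function and parameter names are mine.

```python
def roa_fifo_from_lifo(net_income_lifo, total_assets_lifo,
                       reserve_begin, reserve_end, tax_rate):
    """Return on assets restated from LIFO to FIFO."""
    # Lower COGS under FIFO raises income by the after-tax increase in the reserve.
    net_income_fifo = net_income_lifo + (reserve_end - reserve_begin) * (1 - tax_rate)
    # Assets go up by the after-tax ending reserve, as in the equation above.
    total_assets_fifo = total_assets_lifo + reserve_end * (1 - tax_rate)
    return net_income_fifo / total_assets_fifo


roa_lifo = 178 / 5570
roa_fifo = roa_fifo_from_lifo(178, 5570, 547, 867, 0.333)
print(f"LIFO ROA {roa_lifo:.1%}, FIFO ROA {roa_fifo:.1%}")
```

With the numbers from the practice problem it reproduces the 3.2% and 6.4% figures.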





A Brute Force way to Auto Detect Signs in Financial Statements

When reading the financial statements of any given company, or any spreadsheet in general, a situation you constantly run into is that numbers are presented in a slightly inconsistent way regarding the signs (+/-). For example, for an expense line item, people might assume it is obviously an expense (an outflow of cash) and present the number as is. Other times, people want to distinguish expenses from income and present them differently from revenue by putting the numbers in parentheses, like (400). Things can get really complex because some metrics are derived from a series of basic metrics, like net income (the bottom line of the statement of operations), which should be the ultimate result after adding and subtracting the relevant gains and losses. I always have a hard time calculating the final results because the signs on each line confuse me. I end up playing with the signs of each line with my fingers crossed, hoping I will be “lucky” enough to make the math work.

Screen Shot 2018-10-06 at 10.37.44 AM

Above is part of the cashflow statement from the unaudited 10-Q of Arrow Electronics, Inc., a publicly listed company. They did a great job because the bottom line, the cash used for operations, is a simple arithmetic sum of all the numbers, provided you treat numbers in parentheses as negative, that is, subtract them.

Screen Shot 2018-10-06 at 10.42.19 AM

However, I want to use this data as an example to see how my script can “auto detect” that the numbers in parentheses should actually be subtracted. The Python script takes a brute force approach to find the right sign for each line item in order to reach the right “bottom line”. The idea is very simple: it tries each state for each line (positive, negative or excluded) until it finds a combination of signs that matches the bottom line within a certain error tolerance.

Now let’s take a quick look at the implementation.

Screen Shot 2018-10-06 at 10.55.48 AM

First, we load all the data points into a data structure. Since we are not sure about the sign of each line, or even whether a line item should be included at all, we treat them all as positive numbers to get started. The benchmark variable is the final answer that we need to match. The base variable is 3 because each number can be treated as positive, treated as negative, or not included (for example, there could be an intermediate subtotal, and including it along with all the basic line items would lead to double counting).

Screen Shot 2018-10-06 at 10.59.09 AM

baseN is a utility function that I borrowed from Stack Overflow; it converts an integer into a string representation in any base. For example, baseN(4, 3) will be 11. We will use this function to help us loop through all the possibilities.

The next step is to try out all the possible combinations. Frankly speaking, I should have done way more research than I did, but the first thought that came to my mind was to loop through all the possible combinations and represent each state as a base-3 number. For example, we have 11 elements in total, so we have an 11-digit number in which each digit can take three states (negative, not included, positive), represented as (0, 1, 2). In that case:

00,000,000,000 represents the possibility where all numbers are treated as negative, and clearly the total will not add up to our benchmark. Then next,

00,000,000,001 represents the situation where all numbers are negative except the last one, “Other assets and liabilities” 123769, which is excluded. And of course, it will not add up.

00,000,000,002 means treat the last number as positive, leaving the rest negative.

00,000,000,010 means treat the second to last as excluded and the rest as negative.

And so on. When the number has increased all the way to 22,222,222,222, we will have iterated through all the possible combinations, np.power(3, 11) = 177,147 (~177K) in total.

Screen Shot 2018-10-06 at 11.08.54 AM

weight is a variable that stores the sign for each line item. The code is pretty straightforward; I just want to clarify that the use of np.isclose is essential because all financial statements do some sort of rounding. It is really hard to make all the numbers add up perfectly, so tuning the error tolerance is critical; something within 1% is probably a good rule of thumb.

Screen Shot 2018-10-06 at 11.14.32 AM

In the end, the final outcome looks like this. As you can see, since I am comparing absolute values, the script gives me two answers, and it successfully identified that the line items under “changes in assets and liabilities” should be treated with the opposite sign from the rest of the items.

There are several limitations to my implementation out of the box. The first one is scalability: the cost of this brute force approach grows exponentially with the number of variables (3^n combinations for n line items).
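The whole search fits in a few lines. The sketch below is a compact variant rather than my original script: it uses itertools.product to enumerate the three states per line directly, where the original walks a base-3 counter with baseN, and math.isclose in place of np.isclose. The function name is mine.

```python
from itertools import product
from math import isclose


def find_sign_assignments(values, benchmark, rel_tol=0.01):
    """Try every combination of subtract (-1), exclude (0) and add (+1).

    Returns all weight tuples whose weighted sum matches the benchmark in
    absolute value, within the relative tolerance.
    """
    matches = []
    for weights in product((-1, 0, 1), repeat=len(values)):
        total = sum(w * v for w, v in zip(weights, values))
        # Compare absolute values, so a solution and its mirror image
        # (all signs flipped) are both reported.
        if isclose(abs(total), abs(benchmark), rel_tol=rel_tol):
            matches.append(weights)
    return matches
```

With made-up line items [100, 50, 30] and a bottom line of 120, it reports (1, 1, -1) and its mirror (-1, -1, 1), the same “two answers” effect described above. Keep an eye on the tolerance: with 1% and many small line items, unrelated combinations can also land inside the window.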

Here are some interesting ideas for further exploration.

  1. How can you build mathematical relationships automatically within a financial statement, or even across financial statements and across different years? For example, the AR change in the cashflow statement is the difference between the ending AR balances on two balance sheets, etc.
  2. How can you solve the scalability issue by aggregating numbers to reduce the number of variables? For example, we can now exclude all these 11 variables from future computation because they can be replaced by one variable: Net Cash used for operating activities.


Antidilutive in EPS calculation

EPS (earnings per share) is a very important ratio on the income statement. It is calculated as the earnings (net income) attributable to common shareholders divided by the common shares outstanding. It is so important that companies are required to present EPS on the face of the income statement.


As you can see, they show not only the EPS but also, right below it, the diluted EPS. The reason diluted EPS needs to be disclosed to the public is that there are different kinds of securities, like convertible preferred stock, that have the potential of “diluting” the EPS. How big can the difference be? Usually it is pretty small; for Walmart, the difference in earnings per share is only 0.01. But in some cases the difference can be material enough that investors want to know the potential downside.

Say, for example, a convertible preferred stock is paid a dividend and can also be converted into a certain number of common shares. If nothing is converted, we have the simple calculation for basic EPS. Diluted EPS instead evaluates what happens if all the convertible shares were converted into common stock: on one hand, the income available to common shareholders increases because the earnings that used to go to the dividend are now retained; on the other hand, the number of common shares outstanding also increases due to the conversion. There is a scenario where the EPS after conversion is actually higher than the basic EPS. If this happens, the diluted EPS sort of loses its meaning of providing a good projection of the potential downside. Both IFRS and GAAP require that this kind of security, an antidilutive security, be excluded from the diluted EPS calculation.

Now, let’s do some simple calculations and see in what situations an antidilutive security can exist.

Say a company’s net income is I, the number of common shares outstanding is C and the number of preferred shares is P. The terms of the preferred stock are that the annual dividend paid per share is D and each share can be converted into X shares of common stock if wanted.

Basic EPS = (I - P * D) / C

Diluted EPS = I / (C + P * X)

The constraint for the security to be dilutive is Basic EPS >= Diluted EPS:

(I - P * D) / C >= I / (C + P * X)

Cross-multiplying (both denominators are positive), cancelling I * C and dividing by P, we get: X * I - D * C - P * X * D >= 0

I like to rearrange it into the following format:

D <= I / (P + C / X)

This is easy to interpret: P + C / X is the number of preferred shares there would be if all common shares were converted into preferred shares. If the dividend is smaller than the earnings per share in that all-preferred scenario, the security is dilutive. If not, it is anti-dilutive and should be excluded. So you can see that if the dividend on the preferred stock is too high, or the conversion ratio X is too small, it is highly likely that the constraint will not hold and the security will be anti-dilutive. Also, if the number of preferred shares is substantial, the right-hand side shrinks and the security is more likely to be anti-dilutive.
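The check is easy to put into code. This is a sketch of the simplified one-security setup above (a single convertible preferred issue, no other dilutive instruments), with function names of my own choosing; the arguments map to I, C, P, D and X.

```python
def basic_eps(net_income, common_shares, preferred_shares, dividend_per_share):
    """Basic EPS = (I - P * D) / C."""
    return (net_income - preferred_shares * dividend_per_share) / common_shares


def if_converted_eps(net_income, common_shares, preferred_shares, conversion_ratio):
    """EPS if every preferred share converts: I / (C + P * X)."""
    return net_income / (common_shares + preferred_shares * conversion_ratio)


def is_antidilutive(net_income, common_shares, preferred_shares,
                    dividend_per_share, conversion_ratio):
    """True when conversion would raise EPS above basic EPS."""
    return if_converted_eps(
        net_income, common_shares, preferred_shares, conversion_ratio
    ) > basic_eps(net_income, common_shares, preferred_shares, dividend_per_share)


def max_dilutive_dividend(net_income, common_shares, preferred_shares,
                          conversion_ratio):
    """Largest dividend D for which the security is still dilutive."""
    return net_income / (preferred_shares + common_shares / conversion_ratio)
```

With made-up numbers I = 1000, C = 100, P = 10 and X = 2, the threshold dividend is 1000 / (10 + 50), about 16.67: a dividend of 5 is dilutive, a dividend of 20 is anti-dilutive.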





Shark Tank Valuation

There is this fantastic TV show on ABC, Shark Tank, where startups come to pitch their business to a group of seasoned businessmen and businesswomen. The “sharks” decide whether to invest their own money in those startups, so their motivation is to seek a high return, and the questions they ask are usually challenging. I have always asked myself how the sharks can invest that much money, usually hundreds of thousands of dollars for partial ownership (equity), within just a few minutes and, one step further, how they can even tell how much a business is worth in the first place. Sometimes they call out the entrepreneurs’ “crazy” valuation, sometimes they take the offer the way it is, and sometimes they even go beyond what was originally asked.


After watching it for a few seasons, you start to see patterns: they pretty much want to understand the revenue, the margin (in order to calculate gross profit) and the net earnings (take home). Relatively speaking, there are not that many questions about the balance sheet or cashflow. Clearly, their valuations are based on those key financial metrics, but how do they go from those metrics to how much a company is worth? Thanks to some of the heated conversations every now and then, the sharks make comments like “you are asking for a valuation of X times your earnings, you know that will never happen in Y industry, right?”.

So now you have been told the secret. The next question is WHY a company’s worth is a multiple of its earnings, and why that multiple differs across industries.

To determine how much anything is worth, there are usually several ways: what it costs, what similar products have sold for recently, and, if you ask someone in finance, they will probably throw you off by saying “the current worth of an asset is the sum of all its future cashflows discounted to today”.

Lots of buzzwords, right? “Cashflow”, “discount”, “asset”, etc. Do not worry, everything should be pretty straightforward after we go through an everyday example.

First, let’s make sure we are all on the same page that “one dollar tomorrow is not the same as one dollar today”. This is called the time value of money and can be explained using your interest-paying savings account. Say you have a very good bank with an interest rate of 10%. A lot of detail goes into how the actual interest (yield) differs based on when and how often interest is paid; let’s put that aside and assume $10 is paid at the end of the year if you put in $100 at the beginning of the year. Then you will have $110 at the beginning of next year, and 10% * 110 = $11 of interest will be paid at the end of that year. The interest gets higher and higher thanks to interest earned on top of earned interest, everyone’s favorite: compounding. So your $100 today looks like it is “worth more” in the future.

Year 0: $100
Year 1: $110
Year 2: $121

Year N: $100*(1+10%)^N

In nature, however, they are the same: you take no risk in this investment since it is the bank (I did not go through 2008, so bear with my naivety here), you do not have to do any work other than providing capital, and this kind of investment is open to pretty much everyone. Therefore, any reasonable investor will take this return for granted as the baseline when deciding to park their money somewhere else.

If someone asks you to invest or lend them your money, you should expect them to pay you back more than what they borrow from you today. Or, the other way around, you should expect to lend out less today if they promise to pay you back a fixed amount in the future.

From another perspective, $100 next year equals $91 today, the same ratio as $110 to $100 in the previous example; $100 in two years equals $83 today, the same ratio as $121 to $100. Now we are ready to look at how a company generates money for its investors.
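Compounding and discounting are the same formula read in two directions, which a couple of one-liners make obvious. The function names are just for illustration.

```python
def future_value(amount_today, rate, years):
    """What money put aside today grows to: amount * (1 + rate)^years."""
    return amount_today * (1 + rate) ** years


def present_value(amount_later, rate, years):
    """What money received later is worth today: amount / (1 + rate)^years."""
    return amount_later / (1 + rate) ** years


for n in (1, 2):
    print(n, round(future_value(100, 0.10, n)), round(present_value(100, 0.10, n)))
```

At 10% this prints 110 and 91 for one year, then 121 and 83 for two years, matching the numbers above.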

A company has the goal of making a profit: enough to keep its day-to-day operations going (if not, it will not live long), and ideally with some profit left after paying for all kinds of expenses, which is called net income or earnings. Earnings are sometimes reinvested back into the business to scale up its success, and sometimes that extra money is given back to the investors. Of course, depending on the stage of the company and its potential (industry, business model, environment, etc.), investors will expect different uses of those earnings; that is also why some companies pay dividends of up to 10% and some tech companies pay literally no dividend. Let’s say the business that goes on Shark Tank is already operating and has already attracted a stable group of customers with returning business, so the revenue pays for everything with some left over. If I am the owner of this business, we can start by assuming all earnings go into our pocket. Let’s even fix everything else and assume that from now on the company will make the same amount of earnings ($100K) every year. In that case, it is a pretty good way of collecting cash in the future. Using the idea we mentioned, let’s try to convert all those future annual net incomes into today’s worth. Shall we use the interest rate above? I will say yes for now. In reality it is probably not this simple: the risk of running a small business is high and you will probably need to put in effort, so you should probably expect a rate of return higher than bank interest.

Year1 -> today: $100K / (1+10%)
Year2 -> today: $100K / (1+10%)^2
Year3 -> today: $100K / (1+10%)^3

YearN -> today: $100K / (1+10%)^N

In order to hand this cash cow over to you, the entrepreneur will probably trade it fairly at a price equal to today’s worth of all that future money.

The sum is a very simple geometric series, which can be written as the following equation:

valuation = $100K/(1+10%) * (1-q^n) / (1-q) where q=1/1.1=1/(1+rate)

Now we know the magical multiple is pretty much

valuation/earnings = (1/(1+r)) * (1-1/(1+r)^n) / (1-1/(1+r)) = (1-1/(1+r)^n) / r
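To sanity check the multiple, you can add up the discounted earnings year by year and compare with the closed form of the geometric series, which simplifies to (1 - (1 + r)^-n) / r. This is a sketch under the same assumption as the text, constant earnings every year; the function names are mine.

```python
def multiple_by_summing(rate, years):
    """Add up 1 / (1 + rate)^k for k = 1..years, per dollar of earnings."""
    return sum(1 / (1 + rate) ** k for k in range(1, years + 1))


def multiple_closed_form(rate, years):
    """Closed form of the same geometric series."""
    return (1 - (1 + rate) ** -years) / rate
```

Two things are easy to see this way: the two functions agree, and as the number of years grows the multiple flattens out at 1 / rate, for example 10 times earnings at a 10% rate.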

I plotted a heatmap of the “multiple” against the number of years and the IRR (internal rate of return). The interesting pattern is that as the number of years increases, the multiple caps at a certain value; the higher the rate, the sooner it flattens and the lower its cap is.

Screen Shot 2018-09-16 at 10.48.44 PM

I also expanded the limits of each variable to range from 1 to 100, and here is the bird’s-eye view.

Screen Shot 2018-09-16 at 10.50.29 PM.png

As an investor, I might ask myself how long this business will keep making money for me (dividends, or capital gains from reinvested earnings) and what rate of return I require, taking the market interest rate plus the premium I want to add on top of it.

Screen Shot 2018-09-16 at 10.56.45 PM

For investors, the equivalent of the bank rate is usually the big index funds, and the S&P 500 has historically achieved an annual return of ~11%. Suppose we double that benchmark to 22% as the required rate of return. Then even holding the company forever is worth no more than about 4.5 times its annual earnings, because as n grows the geometric series with q = 1/(1+r) sums to earnings / r, here 1/0.22. Again, the power of compounding: $100 received in 7 years is only worth $25 today, $5 if received in 15 years and roughly $1 in 23 years.

As you can see, if your business is attractive enough that your investors believe it will keep existing forever, if you can convince them that your business is less risky, and if there is no good equivalent on the open market (low interest rates), then you position yourself toward the top right of the chart. Of course, this article is based on the strong assumption that earnings are fixed, which they never are: the business may operate for only a few years (fewer than half of small businesses survive their first 5 years), or the earnings may swing up and down for various reasons.

In the end, hopefully this article helps you understand what a company is worth from the most basic perspective, and which factors investors and entrepreneurs need to pay attention to in order to evaluate a company’s worth fairly.

Parting words from Shark Tank again: “in sales, we trust”


  1. SharkTank from ABC, screenshot, comments, copyright reserved to the original author.
  2. Yahoo Finance

Degree of * Leverage

Within corporate finance, there is a focus area where management optimizes the cost structure. There are several key measurements that identify business risk under the name of leverage. They borrow the idea of elasticity: the percentage change in one variable for a one percent change in another variable. The most commonly used are the degree of operating leverage (DOL) and the degree of financial leverage (DFL).

Attached is a screenshot of how I derived the equation based on the definition.

Screen Shot 2018-08-19 at 7.54.02 PM

In the end, the degree of total leverage (DTL) is defined as the % change in net income over the % change in units sold. One should easily reach the conclusion that DTL = DOL * DFL.
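A quick numeric check of DTL = DOL * DFL, using the usual textbook closed forms, which may be written differently from my derivation in the screenshot. The cost model (a price and variable cost per unit, a fixed operating cost, a fixed interest expense, a flat tax rate) and all the names are assumptions for illustration.

```python
def net_income(units, price, variable_cost, fixed_cost, interest, tax_rate):
    """Net income under a simple linear cost model."""
    ebit = units * (price - variable_cost) - fixed_cost
    return (ebit - interest) * (1 - tax_rate)


def degree_of_operating_leverage(units, price, variable_cost, fixed_cost):
    """% change in EBIT per 1% change in units sold."""
    contribution = units * (price - variable_cost)
    return contribution / (contribution - fixed_cost)


def degree_of_financial_leverage(ebit, interest):
    """% change in net income per 1% change in EBIT."""
    return ebit / (ebit - interest)


def degree_of_total_leverage(units, price, variable_cost, fixed_cost, interest):
    """% change in net income per 1% change in units sold."""
    contribution = units * (price - variable_cost)
    return contribution / (contribution - fixed_cost - interest)
```

Because everything is linear in units sold, bumping units by 1% and measuring the percentage change in net income gives the same number as the closed form, and the tax rate drops out.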

Return-Generating Models: The Market Model Beta

R_i = alpha_i + beta_i * R_m + e_i

where R_m is the return of the market. You can simply take the monthly returns of a given stock and of the S&P 500 as the market, run a linear regression, and the slope will be your beta.
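For a single explanatory variable, the regression slope is just the covariance of the two return series divided by the variance of the market return, so it can be sketched without any library. The function names are mine, and the inputs are assumed to be two price or return lists covering the same months.

```python
def simple_returns(prices):
    """Period-over-period returns from a list of prices."""
    return [later / earlier - 1 for earlier, later in zip(prices, prices[1:])]


def market_model(stock_returns, market_returns):
    """Least-squares fit of R_i = alpha + beta * R_m; returns (alpha, beta)."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    covariance = sum(
        (s - mean_s) * (m - mean_m)
        for s, m in zip(stock_returns, market_returns)
    )
    variance = sum((m - mean_m) ** 2 for m in market_returns)
    beta = covariance / variance
    alpha = mean_s - beta * mean_m
    return alpha, beta
```

Feed it the monthly returns of the stock and of the index over the same window, and the second value returned is the beta.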


Based on this post from Quantitative Finance on Stack Exchange, the published beta is calculated from the monthly returns over the past three years. Comparing it with our calculation in Python, the numbers line up pretty well.

Note: I was using the close price; with the open price the result was pretty close too. However, the beta calculated from the high and low prices is pretty different from the one on Yahoo Finance.