Feature Construction for LM – Polynomials

In Chapter 9.5.1 of Sutton’s book “Reinforcement Learning – An Introduction” (November 2017 draft), the author discusses a scenario where one can construct features for a linear model using “interactions” between different dimensions of a state.

Screen Shot 2018-02-10 at 10.43.40 AM

It is a very short math equation, but not quite straightforward to fully comprehend. Let’s work through a few examples to understand this equation more intuitively and, hopefully, to see why there are (n+1)^k different features.

Let’s assume that we have three dimensions in the state space, like the physical spatial position of an object; in this case, k = 3. Let’s assign different values to n, starting from 0, and see how the equation unfolds as we grow n.

Screen Shot 2018-02-10 at 11.00.40 AM
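To make this concrete, here is a hedged sketch that enumerates the polynomial features for a k-dimensional state, with every exponent drawn from {0, …, n}; counting the exponent tuples is exactly why there are (n+1)^k features:

```python
from itertools import product

def poly_features(state, n):
    """All products s_1^c_1 * ... * s_k^c_k with each exponent c_j in {0, ..., n}.

    There are (n+1) choices per dimension and k dimensions, hence
    (n+1)**k features in total. This is a sketch of the construction,
    not code from the book.
    """
    k = len(state)
    feats = []
    for exps in product(range(n + 1), repeat=k):
        f = 1.0
        for s, c in zip(state, exps):
            f *= s ** c
        feats.append(f)
    return feats
```

For k = 3 and n = 0 this yields the single constant feature 1; for n = 1 it yields 2^3 = 8 features, matching the pattern in the table above.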



Sympy – Solver

If you want to solve an equation using Python, the Sympy solver is an easy solution. (I wish I had had access to this library twenty years ago; I might have done a better job on my school homework.)

Screen Shot 2018-01-13 at 6.17.25 AM.png

I am taking the course Fundamentals of Supply Chain from MIT on edX. There is a small math problem at the beginning of the course where you need to calculate the coefficients of a SKU distribution following a power law. To be more specific, the question assumes the distribution follows the power law y = a * x^b, where y denotes the percentage of items sold and x denotes the percentage of products. At a high level, this is the law behind a few popular products accounting for the majority of your sales.

Consider the example of a store where 5% of the products account for 66.6% of the items sold, and 50% of the products account for 95% of the items sold. – Course

In this case, we substitute x and y with the real store data, we have

0.666 = a * (0.05)^b
0.95 = a * (0.5)^b

Two variables, two independent equations. You can solve this system by eliminating either unknown variable. Let’s start by eliminating a. We have:

(0.05)^b / 0.666 = (0.5)^b / 0.95

Then we take the natural logarithm of both sides:

b * ln(0.05) - ln(0.666) = b * ln(0.5) - ln(0.95)

Finally, we get:

b = (ln(0.95) -ln(0.666)) / (ln(0.5) -ln(0.05)) = 0.154

a = 0.95 / (0.5^0.154) = 1.06

Double-checking: a * (0.05)^b = 1.06 * (0.05)^0.154 = 0.668 ≈ 0.666. Pretty close; the small gap comes from rounding, which tends to get amplified in nonlinear equations.

Well, as you can see, this equation is not the end of the world, but it took me about ten minutes to write down the steps and punch many buttons on my calculator to get the intermediate results. What if I just want the answer?

Here is how you can use the solver to get the result in a faster, more consistent, and more accurate way:

Screen Shot 2018-01-13 at 6.50.45 AM
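For readers who cannot see the screenshot, the code likely looks something like this (a sketch of the same approach, not necessarily the exact five lines in the image):

```python
from sympy import symbols, nsolve

# Solve 0.666 = a * 0.05**b and 0.95 = a * 0.5**b numerically.
a, b = symbols('a b')
sol = nsolve([a * 0.05**b - 0.666, a * 0.5**b - 0.95],
             [a, b],
             [1, 0.1])  # initial guess for (a, b)
# sol is a column matrix: sol[0] ~ 1.057, sol[1] ~ 0.154
```

The numeric answer matches the hand calculation above: a ≈ 1.06 and b ≈ 0.154.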

Here we go: 5 lines of code. Do I really need to explain what happened? I do not think so, right? 🙂

Sympy seems to have many more capabilities than solving elementary-school algebra problems; it also claims to cover ordinary differential equations, partial differential equations, systems of polynomial equations, and more.

(BTW, if you just want to quickly get your hands dirty, Sympy has a live code shell where you can test out the example code or your own problem)


RL Trading – code study of sl-quant

There is a very interesting post on Hackernoon where the author built a self-learning trading bot that learns and acts to maximize its reward. The post is a fun read because it demonstrates the learning capability under a few naive models. It is actually the underlying implementation that intrigued me the most, so I decided to write this blog post, go through the notebook provided by Daniel, and learn what is happening under the hood.


Screen Shot 2018-01-08 at 10.33.37 PM

The get_reward function is probably one of the most important pieces in designing a reinforcement learning system. In a trading example, it can seem straightforward at first glance because one can simply use the financial position (like P&L) as the measurement; it is probably the one method most people would agree with and adopt.

We can start by looking at the terminal_state == 1 branch of the code, which indicates the last step of the process. In this case, the author simply calls the Backtest function out of the box, passing in the pricing and signal information. Each element of xdata stores a state containing the current price and the difference from the previous day, hence [x[0] for x in xdata] is a list of all the prices. You can then grab the P&L for the last day and you are good to go. (Click here to learn more about the backtest functionality within twp.)

The most interesting part is how the intermediate-step rewards are calculated. Within the terminal_state == 0 branch, there are two if statements, both based on the values of signal[timestep] and signal[timestep-1]. Signal is a “Series with capital to invest (long+, short-) or number of shares”, so signal[timestep] and signal[timestep-1] are the positions for the current and previous step. Interestingly, if both of them equal 0, which basically means nothing is invested, the author deducts 10 points from the reward variable; I think this is to penalize the inactivity of doing nothing. Then there is the case where the signals for today and yesterday differ, which is probably the most challenging part of this reward function.

It builds a while loop that goes backwards in time, comparing consecutive days of signal until two are equal. If the previous step and the one before it happen to have no change, then i = 1, the code jumps out of the loop, and i stays at 1. However, for a very active investor whose position changes every day, i could grow as large as timestep - 1. The code screenshot got cut off; here is the complete code for line 94.

Screen Shot 2018-01-08 at 11.01.48 PM
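Piecing together the description above, a hedged reconstruction of the intermediate-step reward might look like the sketch below. The function name, the exact backward-walk condition, and the holding-bonus term are my guesses from the prose, not the author's verbatim code:

```python
def intermediate_reward(xdata, signal, timestep):
    """Hedged reconstruction of the terminal_state == 0 reward (not exact)."""
    reward = 0
    if signal[timestep] == 0 and signal[timestep - 1] == 0:
        reward -= 10  # penalize doing nothing
    elif signal[timestep] != signal[timestep - 1]:
        # walk backwards while consecutive signals keep changing
        i = 1
        while timestep - 1 - i > 0 and signal[timestep - i] != signal[timestep - 1 - i]:
            i += 1
        price_diff = xdata[timestep][0] - xdata[timestep - i][0]
        # price move times the (signed) previous position, scaled by -100,
        # plus a bonus that grows with position size and holding length
        reward = price_diff * signal[timestep - 1] * -100 \
            + abs(signal[timestep - 1]) * i / 10
    return reward
```

With this sign convention, a short position (negative signal) during a price drop produces a positive reward, matching the +price * -shares * -100 reasoning discussed below.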

Here I had a hard time understanding how this reward function is established. I can understand the price-difference part: multiplying it by the number of shares basically gives you the profit, or the loss if the price dropped. At first, I thought the author must really hate making money, because he multiplies that profit by negative 100. Later on, I realized the signal is interesting: a positive number means the investor is long, owning shares, while a negative number means the investor is short. So if this person is selling, the signal is negative, and if the price increased, this actually ends up as a big positive boost to the reward: +price * -shares * -100 = +100 * P&L.

Last but not least, he also adds a component that multiplies the position held yesterday by the number of days it has been held, divided by 10. This is also a positive number that grows linearly with the holding period (i) and with the absolute size of the position: the bigger the deal, the bigger the reward. Structured this way, the second component of the reward function probably boosts the behavior of holding a large position for a long time. This is interesting because the reward function is crafted to encompass several key data points, yet in the end it collapses back to the P&L; I wonder why he did not just use P&L across the whole journey.

All things said, this is one example of how a reward function can be implemented.


Screen Shot 2018-01-08 at 11.29.39 PM

Here, the inputs to the evaluate_Q function are a trained prediction model, eval_model, and eval_data, a list containing a time series of pricing information.

First, the signal variable gets initialized to an empty pandas Series, and the pricing info is used to build xdata (a time series of all the states), from which the variable state is initialized to the first state. Next comes the grand while loop that does not terminate until the end of the data. Each iteration represents a time step: a day, a minute, or a second, depending on how your time step is defined. At the beginning of the loop, the model predicts the Q value for the given state, generating a score for each action available in that state. Here we have only three actions: buy, sell, or hold, and eval_model.predict is supposed to generate a Q score for each of them. Whichever has the highest score is deemed the best action and acted upon. Taking that action brings the process into the next state, new_state; time_step increases by one, and the signal variable gets updated inside the take_action function, so signal keeps appending until the end of the simulation. eval_reward is recalculated at every step, but only the eval_reward from the last iteration gets used; in my view, the author could move that calculation outside the while loop to improve efficiency.
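The loop above can be sketched in a heavily simplified form. This is not the notebook's code: state transitions and reward bookkeeping are omitted, and the action encoding (0 = buy, 1 = sell, 2 = hold) is an assumption:

```python
import numpy as np

def evaluate_greedy(q_model, states):
    """Greedy rollout sketch: at each state, take the argmax-Q action.

    q_model.predict(state) is assumed to return one score per action.
    Note that in the original notebook eval_reward is recomputed on every
    iteration even though only the final value is used.
    """
    signal = []
    for state in states:
        qval = q_model.predict(state)       # one Q score per action
        signal.append(int(np.argmax(qval))) # pick the best-scoring action
    return signal
```

A stub model with a known predict function is enough to see the greedy selection at work.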


Screen Shot 2018-01-08 at 11.50.40 PM

This is where the magic happens. First, signal gets initialized to an empty pandas Series. Then the code enters a for loop whose number of iterations depends on how many epochs the user wants the model to run. Within each epoch, all the key variables get reinitialized, but the signal variable seems to escape this process.

Within each epoch, there is a while loop stepping through each state. This part of the code is fairly similar to evaluate_Q, as you will see as we read on. First, the model predicts the Q value. However, before the Q value is used to pick the next optimal action, there is a random factor for exploration: with some probability, the next action is chosen randomly to avoid local optima or overfitting. Otherwise, and this is most of the time assuming a fairly small epsilon, the next action is based on the Q-value estimate. After the action is taken and all the key variables are updated, we land in a new state. The new state is then used to predict the next Q value, and the update target becomes the reward for the current step plus the best estimate of the next step's value. This pair is then fed back to the prediction model, further enhancing its ability to predict the Q value for a given state and the right action to take.
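The update step can be sketched like this, with an assumed discount factor GAMMA (the notebook's actual constant, model interface, and terminal handling may differ):

```python
import numpy as np

GAMMA = 0.9  # discount factor; an assumed value, not taken from the notebook

def q_target(q_model, state, action, reward, new_state, terminal):
    """One Q-learning target vector: y[action] = r + gamma * max_a' Q(s', a').

    q_model.predict is assumed to return a numpy array of per-action scores.
    Only the taken action's entry changes; the rest stay at the current
    predictions so the model is not pushed on actions it did not take.
    """
    target = q_model.predict(state).copy()
    if terminal:
        target[action] = reward
    else:
        target[action] = reward + GAMMA * np.max(q_model.predict(new_state))
    return target
```

This target vector is what would then be fed back to the model's fit/train step.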

Screen Shot 2018-01-09 at 12.06.21 AM

Again, I just want to highlight how the recursion happens here. Probably a whole article should be devoted to explaining the mathematical reasoning behind how the model is updated and why the update target is the reward plus an attenuated future Q value. If you want something quick and dirty, this Stack Overflow answer helps explain the difference between value iteration and policy iteration from the implementation perspective.

That is basically it: a more detailed explanation of the source code behind the interesting post about the self-learning quant.


twp – Backtest module Part 1

twp (tradeWithPython) is a utility library meant to help quants who use Python. You can find the source code here and the documentation here. This library has been used by several projects, so I am going to take a dive into the backtest module and show how to use it (given that the author did not put much effort into the documentation, or thought it was already straightforward enough 🙂).

You can find the source code for the backtest module here. At a high level, backtesting in finance refers to “estimating the performance of a strategy or model if it had been employed during a past period.” It enables quants to quickly evaluate any investing strategy, without conducting a real experiment or waiting a significant amount of time, while still gaining real-life insight with some confidence by simulating on existing data. For example, backtesting is a first-class citizen of the popular algo platform Quantopian.


Anyway, now that you understand how important backtesting is, let’s go through the source code of twp.backtest to see how a backtest module gets implemented and what key metrics get captured.

Let’s look at the Backtest class directly; the constructor itself includes all the key data elements outright.

Screen Shot 2018-01-07 at 4.38.12 PM

price is a pandas Series containing the time series of pricing information. If we are looking at a stock price, pd.Series([1,2,3,4,5]) could represent a ticker at $1 per share on the first day, say Monday, with a one-dollar increment for the rest of the week, ending at $5 on Friday. signal is a variable containing the trading operations; for example, [NA, NA, 3, NA, -2] is a valid signal that can be interpreted as the investor going long three shares on the third day and short two shares on the last day, assuming the signal type is “shares” rather than “capital”. The rest of the parameters are fairly easy to understand.

Now let’s look at the body of the constructor:

Screen Shot 2018-01-07 at 4.46.34 PM

At line 117, signal gets “cleaned up” by calling ffill() followed by fillna(0). Using these two methods in conjunction is very common for dealing with time series with missing values. Using the example signal = [NA, NA, 3, NA, -2] again: ffill() is the same as locf in R, which in essence fills each missing value with the last non-missing value. However, leading missing values, like the first two NAs in our example, have no preceding valid value, so they stay NA. After that, fillna(0) replaces the remaining NAs, in this case the leading ones, with whatever value gets passed along, here 0.

Screen Shot 2018-01-07 at 4.54.53 PM

Next is the tradeIdx variable, which is basically the difference between each pair of consecutive elements, so theoretically tradeIdx would be one element shorter than signal. However, the very first element of the Series has no prior element to subtract from, so it gets filled with NA, keeping the same length as the input. Then fillna(0) replaces that NA with 0 right after.
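Using the running example, the cleanup-and-diff pipeline behaves like this:

```python
import numpy as np
import pandas as pd

signal = pd.Series([np.nan, np.nan, 3, np.nan, -2])

cleaned = signal.ffill().fillna(0)        # [0, 0, 3, 3, -2]
tradeIdx = cleaned.diff().fillna(0) != 0  # True on the days the position changes
trades = signal[tradeIdx]                 # the actual trades, index preserved
```

Here tradeIdx is True only on day 3 (index 2) and day 5 (index 4), exactly the two days on which the investor did something.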

Screen Shot 2018-01-07 at 5.15.55 PM

Then tradeIdx is used to slice signal and store the trades into the variable self.trades, keeping the index numbers.

Now, let’s talk about the shares variable. It is calculated like this:

tradeIdx = signal.ffill().fillna(0).diff().fillna(0) != 0
trades = signal[tradeIdx]
shares = trades.reindex(signal.index).ffill().fillna(0)

In essence, “shares” is basically “trades” reindexed and forward-filled, which is basically the cleaned “signal”. So to work out on which day the investor actually bought or sold, we need to diff the shares to calculate the delta.

Screen Shot 2018-01-07 at 5.53.33 PM

The screenshot above clearly shows how it looks. Let’s translate the verbose code into plain English, which might be a bit easier to interpret. In the end, this describes a scenario where this person netted $6 by borrowing money to buy 3 shares of stock on the third day and flipping them on the last day, after the price per share increased by $2, leaving a $6 net profit on the book. At the same time, this person not only sold all the shares he bought; he is even in a position of -2 on the last day, which indicates that he sold shares he does not own. He could be borrowing two shares from somebody else at the price of $5 each, selling them now, and buying them back in the future when the price is low. In that case, this person can not only pay back the two shares to their original owner but also profit, since he is in a short position and the price dropped in his favor.

Let’s try another example, with some initial cash, say $100, putting this person in a short position against a growing underlying stock price. Unlucky.

Screen Shot 2018-01-07 at 6.03.43 PM

As you can see, this person borrowed and cashed out some stock on the third day: -9 in value and 109 in cash. Then the underlying price kept going up, and the three shares he shorted, once worth $9, now leave him owing $15 worth of stock, a net loss of $6.


Screen Shot 2018-01-07 at 6.16.28 PMScreen Shot 2018-01-07 at 6.16.47 PM

Then, in the backtest, the author implemented a sharpe method on the class, and there is also a utility function defined outside the class, also called sharpe, that takes the daily Sharpe ratio as an argument and converts it from daily to annualized.
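As a sketch of that conversion, assuming the usual 252-trading-days-per-year convention (twp may use a different constant):

```python
import numpy as np

def annualized_sharpe(daily_returns, trading_days=252):
    """Daily Sharpe ratio (mean/std of daily returns) scaled by sqrt(252).

    A sketch of the standard daily-to-annual conversion; risk-free rate
    is omitted for simplicity.
    """
    daily = np.mean(daily_returns) / np.std(daily_returns)
    return np.sqrt(trading_days) * daily
```

The scaling works because the mean of returns grows linearly with the horizon while the standard deviation grows with its square root.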

Screen Shot 2018-01-07 at 6.28.59 PM

Today, we have covered the majority of the logic in the backtest module; however, there are still a few functions, like plotting, that we will cover in the second part of this tutorial.



Tensorflow – “Wide” tutorial

I had not visited TensorFlow for quite a while, and recently I had a use case where I needed to do some classification and wanted to give deep learning a whirl. I was surprised by the number of tutorials/examples that have been added in the past few months. Today, I am going to give the “TensorFlow Linear Model Tutorial” an overhaul and carefully study the functions used in it.

The use case is well explained at the beginning of the tutorial, but some of the sample code, for example the input_fn, is a bit daunting, at least to me, at first glance. Here is a list of study notes I have taken on each of the functions used in input_fn.


Within the tf.gfile module, there are many utility functions beyond Exists, for example Copy, Remove, etc. This way, you can do everything using TensorFlow without having to dabble with the os.path library and others, which might be a good option for developers who prefer minimizing the number of dependent libraries in their code.

I have created a Jupyter notebook that demonstrates some of the gfile functionality. Hopefully the code and messages give you a more intuitive feel for how to use those functions.


Screen Shot 2017-12-29 at 2.55.09 PM


The parse_csv function actually took me a while to understand. Its input has the variable name value, which then gets used as the input to tensorflow.decode_csv. The one-liner description for decode_csv is as follows:

Convert CSV records to tensors. Each column maps to one tensor.

And the record_defaults argument is definitely something you need to know if you have been spoiled by calling pd.read_csv a lot.

record_defaults: A list of Tensor objects with types from: float32, int32, int64, string. One tensor per column of the input record, with either a scalar default value for that column or empty if the column is required.

For example, this is how record_defaults looks in the example we are studying:

_CSV_COLUMN_DEFAULTS = [[0], [''], [0], [''], [0], [''], [''], [''], [''], [''], [0], [0], [0], [''], ['']]

You can tell there are 15 elements in this list. The first element is a list with the single element 0; the second is a list with a single empty string; and so on. The values provided here basically say: for the first column of the CSV file, if a default value is needed, e.g. for missing values, use 0. You can find the raw dataset here. By reading the adult.names file, you can tell the first column of the data is the field “age”, a numeric field with values like 21, 50, etc., so using 0 as its default makes perfect sense. The second column is “workclass”, with values like “self-employed”, etc., for which an empty string is a legitimate default for a string/categorical variable.

Here is another example from the TensorFlow source code using the decode_csv function. You can find the dataset used in the example here; record_defaults is now defined as:

defaults = collections.OrderedDict([
    ("symboling", [0]),
    ("normalized-losses", [0.0]),
    ("make", [""]),
    ("price", [0.0])])
types = collections.OrderedDict((key, type(value[0]))
                                for key, value in defaults.items())

#types = OrderedDict([
#    ('symboling', <class 'int'>),
#    ('normalized-losses', <class 'float'>),
#    ('make', <class 'str'>),
#    ...
#    ('price', <class 'float'>)])
As you can see, the record_defaults argument is now list(defaults.values()), which is basically the same as the previous example [[0], [0.0], [''], …, [0.0]]. The second example is very helpful because it shows how to use an OrderedDict to manage the column types and column names, so you can reuse it again and again after manually creating it once.

Screen Shot 2017-12-29 at 3.33.25 PM

The example above demonstrates how to use decode_csv in a minimal fashion. Since I am using tf.InteractiveSession, I can simply call .eval() on any tensor object to print it out for debugging purposes.

Features Dictionary

I want to briefly discuss how the features variable gets generated. Clearly, it is a dictionary, and the fascinating part is not only how it is constructed, but also how the labels (y) field gets excluded.

zip is a useful function that many people underutilize. The following few lines of code not only demonstrate how to use zip, but also show how it differs between Python 2.7 and Python 3, and how to construct a dict from it.
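Here is a minimal sketch of the pattern the tutorial uses, with a toy subset of the real column list (the actual tutorial has 15 columns and pops the income label):

```python
# Toy subset of the census columns; the real list has 15 entries.
_CSV_COLUMNS = ['age', 'workclass', 'label']
row = [39, 'State-gov', '<=50K']

# zip pairs each column name with its value; in Python 3 zip returns a
# lazy iterator (in Python 2.7 it returned a list), and dict() consumes it.
features = dict(zip(_CSV_COLUMNS, row))

# pop removes the target from the features dict and returns it.
labels = features.pop('label')
```

After the pop, features contains only the model inputs, which is exactly how input_fn separates x from y.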

Screen Shot 2017-12-29 at 3.49.59 PM

After all of that, here is an example that might capture everything you need to know about parse_csv, in a more complete context.

Screen Shot 2017-12-29 at 3.59.44 PM


Starcraft II – sc2client-api

I was reading about DeepMind collaborating with Blizzard, trying to build a reinforcement-learning-empowered artificial general intelligence bot. As a Starcraft fan since 2000, it is very exciting to read about the progress that has been made, along with some of the fancy YouTube videos out there.

There are mostly two topics around sc2 bots: one centered on building the bot itself, and the other on the environment, i.e., how to interact with the game and programmatically control the units within it. Writing a bot for Starcraft is not new; the game comes with its own map editor, and you can customize the map and write a bot. I clearly remember my childhood friends and I spending almost a whole summer beating 7 bots on the map Big Hunter in Starcraft I, then leveling up and pursuing the challenge of finding different implementations of “advanced bots” and beating them (one of the implementations doubled the minerals and gas for your opponent; that was crazy). Almost two decades have passed, and I still do not know how to write a bot. I have seen some crazy logic written in Galaxy, and here is a great YouTube channel showing off its power.

Besides the map editor, Blizzard has also published libraries for researchers and developers to interact with the game using general-purpose programming languages.

I came across two libraries that confused me a little: s2client-api and s2client-proto. Here are the repo descriptions for the two.

s2client-api: StarCraft II Client – C++ library supported on Windows, Linux and Mac designed for building scripted bots and research using the SC2API.

s2client-proto: StarCraft II Client – protocol definitions used to communicate with StarCraft II.

Clearly, both libraries claim to let you control the game via an API, but s2client-proto is written mostly in Python while s2client-api is in C++.

I have not yet looked into the Python solution, merely because of the name of the project s2client-api: it must be the official API, right? Also, I usually assume that when there are two equivalent implementations of a functionality, one written in C++ and the other in Python, the latter is a mere wrapper around the C++ solution, so I decided to give s2client-api a try first.

The downside is that you have to install lots of development tools if you are not an active C++ developer. Like me: I had to install CMake and the latest Visual Studio 2017 Community, which is actually pretty easy to do. The setup tutorial is well written and easy to follow.

I was very excited to get the first tutorial up and running, seeing the SCVs start mining minerals, etc. The documentation is great, and the three tutorials work right out of the box.

In the end, I modified the code a little to implement two pieces of logic:

  1. wait till you have 50 marines and then attack the enemy base
  2. instead of one barracks, build more

And here is the code snippet.


Last but not least, here is an exciting screenshot of how that simple logic on top of tutorial3 killed my opponent.






GDAX – orderbook data API level1,2,3

GDAX is a cryptocurrency trading platform and a subsidiary of Coinbase. They offer a great platform along with a suite of API services for easy access to trading data, both historical and real-time.

I have used the REST API before, but GDAX suggests using the websocket feed instead of polling the API for real-time data.


“Only the best bid and ask”

Screen Shot 2017-11-28 at 7.29.28 PM

As you can see, on the trading platform, the best ask (offer to sell) is 10077.92 USD/BTC, and at the same time the best bid (offer to buy) is 10077.91 USD/BTC. There are 53 different orders at the best bid price, with a total bid size of 35.26 BTC. Same for the asks.


“Top 50 bids and asks (aggregated)”

At level 2, instead of the single best bid and ask, the API returns the top 50 bids and the top 50 asks. The data is in the same format [price, size, numberOfOrders]. With this information, you have enough to plot the depth chart (Investopedia explanation).
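The y-axis of a depth chart is the cumulative size at or better than each price. Here is a toy sketch of that aggregation; the book entries are made up, using the level-2 shape [price, size, numberOfOrders]:

```python
# Toy level-2 style book: bids sorted best-first (descending price),
# asks sorted best-first (ascending price). Numbers are illustrative.
bids = [[10077.91, 35.26, 53], [10077.90, 2.0, 3], [10077.00, 5.0, 1]]
asks = [[10077.92, 1.5, 2], [10078.00, 4.0, 6], [10080.00, 10.0, 4]]

def depth(levels):
    """Running total of size at each price level: one side of a depth chart."""
    total, out = 0.0, []
    for price, size, _ in levels:
        total += size
        out.append((price, total))
    return out
```

Plotting depth(bids) and depth(asks) against price gives the two walls of the familiar depth chart.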

Screen Shot 2017-11-28 at 7.40.57 PM

However, the price range is tiny: (10112.14 - 10076.77) / 10076.77 is only about 0.35%, which is not sufficient to draw a meaningful depth chart.

Screen Shot 2017-11-28 at 8.26.54 PM


“Full order book (non aggregated)”

This is a BIG request; however, the information is quite rich, and it is a complete snapshot of the order book. The response format is slightly different: there is no aggregated order count anymore, because each record is a single order without any aggregation; instead, they provide the order ID.

As shown in a snippet of the response, there are even bids at a price of 0.01 USD/BTC. Those bids were probably either added to the order book years ago or placed by disbelievers. On the other side of the spectrum, you can easily see people asking to sell BTC at 9,999,999,999 USD/BTC. Good luck to them 🙂

Screen Shot 2017-11-28 at 8.30.33 PM

Screen Shot 2017-11-28 at 8.52.10 PM

This is a histogram of how orders are distributed across prices. As you can see, there are a few peaks. The biggest peak is at the lowest point, maybe under a dollar. Then there are a few big peaks around $800, $5000, $8000, and $10000.

This is interesting, but not exactly the depth chart you usually see.

Screen Shot 2017-11-28 at 8.53.34 PM

For example, this is the market depth chart from GDAX. Clearly, the x-axis is still the BTC/USD exchange rate, but the y-axis is the cumulative number of bitcoins at each price. I am not exactly sure of the resolution in this graph, but let’s take a quick look and see if we can reproduce it in Matplotlib.

The code might not be the most efficient, but it does what you want it to do.

Screen Shot 2017-11-28 at 10.42.59 PM


If you are on the GDAX website, you may notice your computer’s fan spinning a bit faster. Behind the scenes of the numbers jumping up and down, the browser has actually subscribed to the GDAX websocket to stream near-real-time trading information from their server and update the page. This is the exact websocket that the developer manual suggests users adopt to receive the information feed.

Screen Shot 2017-11-28 at 10.44.57 PM
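A minimal sketch of such a subscription message follows. The field names ("type", "product_ids", "channels") follow the GDAX websocket docs, though the channel chosen here is just an example:

```python
import json

# Build the JSON message a client would send right after connecting
# to the GDAX websocket feed (wss://ws-feed.gdax.com at the time).
subscribe_msg = json.dumps({
    "type": "subscribe",
    "product_ids": ["BTC-USD"],
    "channels": ["ticker"],
})
```

After sending this, every incoming frame is a JSON message the client must parse itself to maintain its own view of the book.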

Of course, in this case, you take on the responsibility of parsing the messages, computing the latest state, and dealing with everything yourself.

So far, you should have a rough understanding of what kind of information is provided by the various levels of the GDAX order book API. We also briefly touched on the GDAX websocket. In the next article, I am going to take a deep dive into the websocket data types and message formats; hopefully, we can also cover MongoDB in more detail, regarding how to query the transaction information and derive analytics.

Python Write Formula to Excel

I recently had a challenge writing formulas to an Excel file using Python. I managed to set the value of a cell to the right formula string, but it was not evaluated or active when opened in Excel.

I was using the to_excel method of a pandas DataFrame. Then someone pointed me to this awesome library, XlsxWriter. It supports many Excel features, including but not restricted to formulas, charts, images, and even customized formats.

Here is a short snippet of code showing how it worked right out of the box.

Screen Shot 2017-10-12 at 10.28.41 PM

And the output looks straightforward and satisfying.

Screen Shot 2017-10-12 at 10.29.01 PM

Also, as you can see from the code, you should really try to get out of the business of working with row and column indexes directly. Whenever I see things like row = row + 1 or i++, it reminds me of languages like C++ and Java, which we should stay away from here.

Here is another example of working directly with a pandas DataFrame using XlsxWriter as the engine.

Screen Shot 2017-10-12 at 10.54.33 PM

And this is how the output file looks: we have a new column called col2 containing active links. Clearly, the formula has been evaluated and is active; when you click a link, it jumps directly to cell A1 of sheet1. Problem solved.

Screen Shot 2017-10-12 at 10.56.12 PM

Oracle SQLdeveloper Reset Expired Password

Today I was trying to access an Oracle database after finally being given the credentials, which I had been waiting on for a long time. Typical IT, isn’t it?

However, I am using a tool called “Oracle SQL Developer”, which made this such a fun experience that I wanted to document and share it.

When I clicked to create a new connection and filled in all the required information (BTW, the SID is the database name: the System IDentifier), it prompted me with the error message “the password has expired”.


I tried a few times, making sure it was not a typo, but every time it gave me the same error message. I also randomly entered a few strings, and those gave me a different error: “invalid username/password; logon denied”. This made me believe the password our DBA gave me was right; it had just expired. My first question was: my credential had just been created, brand new, so how could it expire? Clearly, there is a difference between how milk expires and how a password expires.

After a few quick Google searches, I realized that some DBAs set the default password as expired out of the box, to force users to reset it, much like the confirmation emails of many web subscriptions linking you directly to a reset-password page. However, now knowing that I needed to reset my password, the frustrating part was: how?

Looking at the connection wizard, there is nowhere to click, no button, that offers the option of resetting the password. I navigated through all the buttons, drop-downs and tabs and still couldn’t find any sign of “reset password”. “Test Connection” kept showing the same error message, and you cannot “Connect” either.

This reminds me of the scenario where it is your first day of employment and your manager tells you your badge is waiting at the front desk. However, security will not even let you into the building because you do not have your badge. Yet to get the badge, you need to get into the building to visit the front desk…

I notified my DBA of the awkward situation and he replied, in one sentence without any further explanation, that I needed to reset my password. I already felt like I was making a fool of myself, so I figured I had better work this “reset my password” thing out all by myself.

In the end, believe it or not, you need to first ignore the error message and SAVE the connection as if it were a valid connection. Then, right-click the saved connection to bring up its context menu, and the “Reset Password” option will be there for you to use. From there, everything is straightforward: you enter your old and new passwords and you are IN!

Screen Shot 2017-10-09 at 4.08.12 PM

This is purely a UI/UX problem that someone should have caught, but I just want to share this fun experience so it can save a bit of time for those like me: who happen to be new to Oracle, who happen to be given an expired password, who happen to use SQL Developer, and who happen to think you need to connect first before you can change your password.


Regularization is a trick for avoiding “over-complicating” your model, especially in cases where the weights become extraordinarily big. A particular set of weights might minimize the overall cost function, yet that same set might lead to “overfitting”, where the model does not perform well when new data comes in. To control the overall size of the weights, people came up with several ways of appending a term to the existing cost function, called the regularization term. It can be the sum of the absolute values of all the weights (L1), or the sum of their squares, i.e. the squared L2 norm. There is usually a constant you multiply the regularization term by in the cost function: the bigger that constant is, the more you penalize the overall size of the weights. Vice versa, if the constant is really small, say 10^(-100), it is almost zero, which is equivalent to not having regularization at all. Regularization usually helps prevent overfitting, generalizes the model, and can even increase its accuracy. What makes a good regularization constant is what we are going to look into today.
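As a tiny concrete example of the two penalties (the weight values here are made up purely for illustration):

```python
weights = [0.5, -2.0, 3.0]  # hypothetical model weights

# L1 penalty: sum of the absolute values of the weights.
l1 = sum(abs(w) for w in weights)   # 0.5 + 2.0 + 3.0 = 5.5

# L2 penalty: sum of the squared weights.
# (TensorFlow's tf.nn.l2_loss additionally divides this by 2.)
l2 = sum(w * w for w in weights)    # 0.25 + 4.0 + 9.0 = 13.25

beta = 0.01  # regularization constant: the bigger, the stronger the penalty
print(l1, l2, beta * l2)
```

Note how the big weight (3.0) dominates the L2 penalty far more than the L1 penalty, which is why L2 regularization is particularly aggressive about shrinking large weights.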

Here is the source code for regularizing the logistic regression model:

logits = tf.matmul(tf_train_dataset, weights) + biases
# Per-example cross-entropy loss, before any regularization
loss_base = tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits)
# L2 penalty: tf.nn.l2_loss computes sum(weights ** 2) / 2
regularizer = tf.nn.l2_loss(weights)
# beta scales how strongly large weights are penalized
loss = tf.reduce_mean(loss_base + beta * regularizer)
optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

As you can see, the code is pretty much the same as the version without regularization, except that we add the component “beta * tf.nn.l2_loss(weights)”. To better understand how the regularization piece contributes to the overall accuracy, I packaged the training into one function for easy reusability. Then I changed the value of beta from extremely small to fairly big and recorded the test accuracy, along with the training and validation accuracy on the last batch as a reference, and plotted them in different colors for easy visualization.
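The shape of that experiment can be sketched in plain NumPy (this is not the notebook’s TensorFlow code; the synthetic data and the `train` helper are assumptions standing in for the real dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data (a stand-in for the real dataset).
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = (X @ true_w + 0.5 * rng.normal(size=200) > 0).astype(float)

def train(beta, steps=500, lr=0.5):
    """Gradient descent on L2-regularized logistic loss; returns train accuracy."""
    w = np.zeros(5)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # sigmoid predictions
        grad_w = X.T @ (p - y) / len(y) + beta * w     # data gradient + L2 term
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    preds = (X @ w + b > 0).astype(float)
    return np.mean(preds == y)

# Sweep beta from tiny to large, as in the experiment described above.
for beta in [1e-5, 1e-3, 1e-1, 1.0, 10.0]:
    print(f"beta={beta:g}  train accuracy={train(beta):.3f}")
```

The same loop, with the model swapped for the TensorFlow one and the test/validation accuracies recorded as well, produces the plot discussed below.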


The red line is what we truly want to focus on: the accuracy of the model running against the test data. As we increase beta from a tiny value (10^-5), there is a noticeable but not outstanding bump in the test accuracy, which reaches its highest point in the range 0.001 to 0.01. After 0.1, the test accuracy decreases significantly as we increase beta; past beta=1, the accuracy collapses to almost 10%, no better than random guessing. We have seen before that our overall loss is not a big number (< 10), so a large beta lets the regularization term dominate it. Even the training and validation accuracy drop to low points when beta is relatively big. In summary, regularization can help avoid overfitting and contribute positively to your accuracy after fine tuning, but it can potentially ruin your model if you are not careful.

Now, let’s take a look at how adding regularization performs on a neural network with one hidden layer.


First, we want to highlight that this graph is on a different scale (y-axis from 78 to 88) than the one above (0 to 80). We can see that the test accuracy fluctuates quite a bit within a small range between 86% and 89%, but we cannot necessarily see a strong correlation between beta and test accuracy. One explanation could be that our model is already good enough that it is hard to see any substantial change. Our neural net with one hidden layer of a thousand ReLU nodes is already sophisticated; without regularization, it can already reach an accuracy of 87% easily.
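For completeness, here is a minimal NumPy sketch (again, not the notebook’s TensorFlow code) of a one-hidden-layer ReLU network with the L2 penalty applied to both weight matrices, on toy data; the sizes and hyperparameters are assumptions chosen to keep it small:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 2-class labels that are a simple function of the first two features.
X = rng.normal(size=(256, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[y]                       # one-hot labels

hidden = 32                            # far fewer than the 1000 nodes in the post
W1 = rng.normal(scale=0.1, size=(20, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(hidden, 2));  b2 = np.zeros(2)
beta, lr = 1e-3, 0.5

for _ in range(300):
    # Forward pass: affine -> ReLU -> affine -> softmax.
    h = np.maximum(X @ W1 + b1, 0.0)
    logits = h @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    # Backward pass, with the L2 gradient (beta * W) on BOTH weight matrices.
    d_logits = (p - Y) / len(X)
    dW2 = h.T @ d_logits + beta * W2
    db2 = d_logits.sum(axis=0)
    dh = d_logits @ W2.T
    dh[h <= 0] = 0.0                   # ReLU gradient mask
    dW1 = X.T @ dh + beta * W1
    db1 = dh.sum(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

acc = np.mean(p.argmax(axis=1) == y)
print(f"train accuracy with beta={beta}: {acc:.3f}")
```

The only difference from the logistic regression case is that the regularizer now sums over every weight matrix in the network, which is also what you would do in the TensorFlow version (e.g. tf.nn.l2_loss(W1) + tf.nn.l2_loss(W2)).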

In the end, regularization is something we should all understand: what it does, and when and where to apply it.