9 A quadratic equation is of the form y = ax² + bx + c.
What if your data is actually more complex than a simple straight line? Surprisingly,
you can actually use a linear model to fit nonlinear data. A simple way to do this is to
add powers of each feature as new features, then train a linear model on this extended
set of features. This technique is called Polynomial Regression.
Let’s look at an example. First, let’s generate some nonlinear data, based on a simple
quadratic equation9 (plus some noise; see Figure 4-12):
import numpy as np

m = 100
X = 6 * np.random.rand(m, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(m, 1)
Figure 4-12. Generated nonlinear and noisy dataset
Clearly, a straight line will never fit this data properly. So let's use Scikit-Learn's PolynomialFeatures class to transform our training data, adding the square (2nd-degree polynomial) of each feature in the training set as new features (in this case there is just one feature):
>>> from sklearn.preprocessing import PolynomialFeatures
>>> poly_features = PolynomialFeatures(degree=2, include_bias=False)
>>> X_poly = poly_features.fit_transform(X)
X_poly now contains the original feature of X plus the square of this feature. Now you
can fit a LinearRegression model to this extended training data (Figure 4-13):
>>> from sklearn.linear_model import LinearRegression
>>> lin_reg = LinearRegression()
>>> lin_reg.fit(X_poly, y)
>>> lin_reg.intercept_, lin_reg.coef_
(array([ 1.78134581]), array([[ 0.93366893, 0.56456263]]))
Figure 4-13. Polynomial Regression model predictions
Not bad: the model estimates ŷ = 0.56x₁² + 0.93x₁ + 1.78 when in fact the original function was y = 0.5x₁² + 1.0x₁ + 2.0 + Gaussian noise.
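To get predictions from the fitted model on new inputs, you must pass them through the same polynomial transformation before calling predict. Here is a minimal end-to-end sketch of the steps above; the new input values are arbitrary illustrations, not from the text:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Regenerate the training data from the example above
m = 100
X = 6 * np.random.rand(m, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(m, 1)

# Add the squared feature, then fit a plain linear model
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)

# New inputs must go through the SAME transformer before predict()
X_new = np.array([[-3.0], [0.0], [3.0]])  # arbitrary test points
y_pred = lin_reg.predict(poly_features.transform(X_new))
```

Note that transform (not fit_transform) is used on the new data, so the new inputs get exactly the same polynomial expansion the model was trained on.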
Note that when there are multiple features, Polynomial Regression is capable of find‐
ing relationships between features (which is something a plain Linear Regression
model cannot do). This is made possible by the fact that PolynomialFeatures also
adds all combinations of features up to the given degree. For example, if there were two features a and b, PolynomialFeatures with degree=3 would not only add the features a², a³, b², and b³, but also the combinations ab, a²b, and ab².
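To see those combination terms concretely, here is a small sketch with two features (the input values a = 2, b = 3 are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features: a = 2, b = 3
X = np.array([[2.0, 3.0]])
poly = PolynomialFeatures(degree=3, include_bias=False)
X_poly = poly.fit_transform(X)

# Columns come out ordered by total degree: a, b, a², ab, b², a³, a²b, ab², b³
print(X_poly)
# [[ 2.  3.  4.  6.  9.  8. 12. 18. 27.]]
```

The cross terms ab = 6, a²b = 12, and ab² = 18 are exactly the feature combinations described above.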
PolynomialFeatures(degree=d) transforms an array containing n features into an array containing (n + d)! / (d! n!) features, where n! is the factorial of n, equal to 1 × 2 × 3 × ⋯ × n. Beware of the combinatorial explosion of the number of features!
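A quick sanity check of that formula: (n + d)! / (d! n!) is the binomial coefficient C(n + d, d), and it matches the output width of PolynomialFeatures when the bias column is kept (the default). The sketch below uses n = 10 and d = 3 as an arbitrary illustration:

```python
import numpy as np
from math import comb
from sklearn.preprocessing import PolynomialFeatures

# (n + d)! / (d! n!) equals the binomial coefficient C(n + d, d)
n, d = 10, 3
expected = comb(n + d, d)  # 286 features, bias column included

X = np.random.rand(5, n)
poly = PolynomialFeatures(degree=d)  # include_bias=True by default
n_out = poly.fit_transform(X).shape[1]
print(n_out)  # 286
```

Even a modest 10 features at degree 3 already yields 286 features, which illustrates how quickly the expansion grows.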