[Week 2-code] Feature Engineering with TFX

(coursera) Machine Learning Data Lifecycle in Production - Feature Engineering, Transformation and Selection

Seunghan Lee

( reference : Machine Learning Data Lifecycle in Production )

Feature Engineering with TFX

Goal : building a DATA PIPELINE using TensorFlow Extended (TFX)

Dataset : Metro Interstate Traffic Volume dataset

  • create an InteractiveContext to run TFX components
  • use TFX ExampleGen to split the dataset
  • use TFX StatisticsGen & TFX SchemaGen to generate statistics & a schema
  • use TFX ExampleValidator to validate the evaluation dataset statistics
  • use TFX Transform to perform feature engineering

Contents :

  • import & define paths
  • create Interactive context
  • StatisticsGen
  • ExampleValidator

1. Setup

(1) import & define paths

Installation ( make sure to restart the runtime afterwards! )

Main packages to import : tf & tfx
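A minimal setup sketch; the directory layout and CSV file name are assumptions, not from the original post:

```python
# Install TFX first, then RESTART the runtime:
# !pip install tfx

import os
import tensorflow as tf
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

# Assumed directory layout and data file name
_pipeline_root = './pipeline/'
_data_root = './data/'
_data_filepath = os.path.join(_data_root, 'metro_traffic_volume.csv')
```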

(2) dataset

  • hourly traffic volume of a road in Minnesota from 2012-2018
  • goal : predict the traffic volume given the date, time, and weather conditions

(3) create Interactive context

initialize InteractiveContext
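A one-line sketch, continuing from the setup above:

```python
# Artifacts and the metadata store will be written under the pipeline root;
# with no arguments, InteractiveContext would use a temporary directory instead
context = InteractiveContext(pipeline_root=_pipeline_root)
```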

2. TFX components

(1) ExampleGen

Summary ( = Ingesting Data )

  • (1) split the data ( train 2/3 : eval 1/3 )
  • (2) convert each row into tf.train.Example format
      • reason : so that other components can access the data!
      • stored in TFRecord format

Example 1) ingest csv data

( = run the component using the InteractiveContext instance )
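A sketch of the ingestion step (recent TFX versions; older ones wrap the path with external_input):

```python
from tfx.components import CsvExampleGen

# Ingest the CSV; by default TFX splits it into train:eval = 2:1
example_gen = CsvExampleGen(input_base=_data_root)
context.run(example_gen)
```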

As shown above, we can confirm that the dataset has been split.

We can also check that the artifacts were created properly.

Let's look at a few examples of the data!

  • URI : Uniform Resource Identifier ( here, the path where the data is stored )
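A sketch of locating the train split and opening it as a TFRecordDataset (the split subdirectory is named 'train' or 'Split-train' depending on the TFX version):

```python
# Get the artifact written by ExampleGen and check its splits & URI
artifact = example_gen.outputs['examples'].get()[0]
print(artifact.split_names, artifact.uri)

# Path to the train split
train_uri = os.path.join(artifact.uri, 'train')

# ExampleGen saves the examples as gzipped TFRecord files
tfrecord_filenames = [os.path.join(train_uri, name)
                      for name in os.listdir(train_uri)]
dataset = tf.data.TFRecordDataset(tfrecord_filenames, compression_type='GZIP')
```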

Example 2) inspect the ingested data

Let's fetch a specified number of examples ( function : get_records() )

get_records(dataset, num_records)

  • dataset : a TFRecordDataset
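A sketch of the helper, which parses each serialized record into a tf.train.Example and converts it to a python dict:

```python
from google.protobuf.json_format import MessageToDict

def get_records(dataset, num_records):
    '''Return num_records examples from a TFRecordDataset as python dicts.'''
    records = []
    for tfrecord in dataset.take(num_records):
        serialized_example = tfrecord.numpy()        # raw serialized bytes
        example = tf.train.Example()                 # empty proto container
        example.ParseFromString(serialized_example)  # parse bytes into the proto
        records.append(MessageToDict(example))       # proto -> python dict
    return records

# Fetch 3 example records
sample_records = get_records(dataset, 3)
sample_records
```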

Result ( fetching 3 example records ) :

(2) StatisticsGen

Computes statistics over the dataset.

  • uses TensorFlow Data Validation (TFDV)

Let's visually inspect the generated statistics.
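A sketch of running the component and rendering its output:

```python
from tfx.components import StatisticsGen

# Compute per-split statistics (TFDV under the hood)
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
context.run(statistics_gen)

# Render the statistics visualization in the notebook
context.show(statistics_gen.outputs['statistics'])
```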


(3) SchemaGen

  • schema : expected bounds, types, properties of features

Let's visually inspect the generated schema.
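A sketch of inferring and displaying the schema:

```python
from tfx.components import SchemaGen

# Infer the schema (types, ranges, vocabularies) from the statistics
schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])
context.run(schema_gen)

# Render the schema table
context.show(schema_gen.outputs['schema'])
```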

The schema generated here will be used later to detect anomalies.

(4) ExampleValidator

  • uses the schema & statistics generated above to detect anomalies
  • ( by default ) compares the training & evaluation splits

Let's visually inspect the detected anomalies.
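A sketch of running the validator and rendering the anomaly report:

```python
from tfx.components import ExampleValidator

# Validate the eval split statistics against the inferred schema
example_validator = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
context.run(example_validator)

# Render the anomaly report (ideally: "No anomalies found")
context.show(example_validator.outputs['anomalies'])
```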


(5) Transform

  • performs feature engineering on the ExampleGen examples, using the schema generated above
  • also requires the "preprocessing function" we want to apply
  • the preprocessing-function code is saved using the %%writefile magic command!

(2) define the transform targets & functions ( _traffic_constants_module_file )
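A sketch of the constants module, as two notebook cells; the feature groupings shown are an assumed subset of the dataset's columns:

```python
_traffic_constants_module_file = 'traffic_constants.py'
```

```python
%%writefile {_traffic_constants_module_file}

# Feature groupings (assumed subset of the dataset's columns)
DENSE_FLOAT_FEATURE_KEYS = ['temp', 'snow_1h']    # scale to z-score
BUCKET_FEATURE_KEYS = ['rain_1h']                 # bucketize
FEATURE_BUCKET_COUNT = {'rain_1h': 3}
VOCAB_FEATURE_KEYS = ['holiday', 'weather_main']  # string -> integer ids
VOCAB_SIZE = 1000                                 # vocabulary size
OOV_SIZE = 10                                     # out-of-vocabulary buckets
VOLUME_KEY = 'traffic_volume'                     # the label

def transformed_name(key):
    """Append a suffix to distinguish transformed features."""
    return key + '_xf'
```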

Run the feature engineering
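A sketch of the preprocessing module and the Transform component; the exact set of transformations is an assumption based on the constants above:

```python
_traffic_transform_module_file = 'traffic_transform.py'
```

```python
%%writefile {_traffic_transform_module_file}

import tensorflow as tf
import tensorflow_transform as tft
import traffic_constants

_t = traffic_constants.transformed_name  # alias for readability

def preprocessing_fn(inputs):
    """tf.Transform callback: map raw features to transformed features."""
    outputs = {}
    for key in traffic_constants.DENSE_FLOAT_FEATURE_KEYS:
        outputs[_t(key)] = tft.scale_to_z_score(inputs[key])
    for key in traffic_constants.VOCAB_FEATURE_KEYS:
        outputs[_t(key)] = tft.compute_and_apply_vocabulary(
            inputs[key],
            top_k=traffic_constants.VOCAB_SIZE,
            num_oov_buckets=traffic_constants.OOV_SIZE)
    for key in traffic_constants.BUCKET_FEATURE_KEYS:
        outputs[_t(key)] = tft.bucketize(
            inputs[key], traffic_constants.FEATURE_BUCKET_COUNT[key])
    # Pass the label through unchanged
    outputs[_t(traffic_constants.VOLUME_KEY)] = inputs[traffic_constants.VOLUME_KEY]
    return outputs
```

```python
from tfx.components import Transform

transform = Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file=os.path.abspath(_traffic_transform_module_file))
context.run(transform)
```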

The output cell of the InteractiveContext above can be inspected via .component.outputs .

  • transform_graph : will be used during both training & serving
  • transformed_examples : the preprocessed training & evaluation data

Get the URI of the Transform graph

Get the URI of the transformed training data

Fetch the first 3 transformed records

( all three steps are sketched in the code below )
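A sketch of the three steps; as before, the split subdirectory name depends on the TFX version:

```python
# 1. URI of the transform graph artifact
transform_graph_uri = transform.outputs['transform_graph'].get()[0].uri
print(os.listdir(transform_graph_uri))

# 2. URI of the transformed training data ('train' or 'Split-train')
train_uri = os.path.join(
    transform.outputs['transformed_examples'].get()[0].uri, 'train')

# 3. Preview the first 3 transformed records, reusing get_records()
tfrecord_filenames = [os.path.join(train_uri, name)
                      for name in os.listdir(train_uri)]
transformed_dataset = tf.data.TFRecordDataset(tfrecord_filenames,
                                              compression_type='GZIP')
print(get_records(transformed_dataset, 3))
```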



Week 2: Feature Engineering, Transformation and Selection

Introduction to preprocessing.

Feature engineering can be difficult and time-consuming, but is also very important to success.

Squeezing the most out of data

  • Making data useful before training a model
  • Representing data in forms that help models learn
  • Increasing predictive quality
  • Reducing dimensionality with feature engineering
  • Feature Engineering within the model is limited to batch computations

Art of feature engineering

Increases the model's ability to learn while simultaneously reducing (if possible) the compute resources it requires.


During serving we typically process each request individually, so it becomes important that we include global properties of our features, such as the standard deviation $\sigma$.

Preprocessing Operations


  • Data cleaning : removing erroneous data
  • Feature tuning : normalizing, scaling
  • Representation transformation : better predictive signals
  • Feature extraction / dimensionality reduction : a lower-dimensional, more robust data representation
  • Feature construction : creating new features

Mapping categorical values

Categorical values can be one-hot encoded if two nearby values are not more similar than two distant values, otherwise ordinal encoded.

Empirical knowledge of data will guide you further

Text: stemming, lemmatization, TF-IDF, n-grams, embedding lookup

Images - clipping, resizing, cropping, blurring, Canny filters, Sobel filters, photometric distortions

  • Data preprocessing: transforms raw data into a clean and training-ready dataset
  • Raw data into feature vectors
  • Integer values to floating-point values
  • Normalizes numerical values
  • String and categorical values to vectors of numeric values
  • Data from one space into a different space

Feature Engineering Techniques

  • e.g., grayscale image pixel intensities are in $[0, 255]$, usually rescaled to $[-1, 1]$

Scaling to an arbitrary range $[a, b]$ :

$$ x_\text{scaled} = \frac{(b-a)(x - x_{\min})}{x_{\max} - x_{\min}} + a, \qquad x_\text{scaled} \in [a, b] $$

Normalization :

$$ x_\text{scaled} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \qquad x_\text{scaled} \in [0, 1] $$

  • Helps NNs converge faster
  • Does away with NaN errors during training
  • For each feature, the model learns the right weights

Standardization

  • Z-score relates the number of standard deviations away from the mean

$$ x_\text{std} = \frac{x - \mu}{\sigma} $$

Bucketizing / Binning

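A toy pandas sketch of the three rescalings above (normalization, standardization, bucketizing); the values are made up:

```python
import pandas as pd

x = pd.Series([2.0, 8.0, 5.0, 11.0, 3.0])  # toy feature values

x_minmax = (x - x.min()) / (x.max() - x.min())  # normalization to [0, 1]
x_zscore = (x - x.mean()) / x.std()             # standardization (z-score)
x_bucket = pd.cut(x, bins=3, labels=False)      # bucketizing into 3 bins
```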

Other techniques

Dimensionality reduction in embeddings

  • Principal component analysis (PCA)
  • t-Distributed stochastic neighbor embedding (t-SNE)
  • Uniform manifold approximation and projection (UMAP)

TensorFlow embedding projector

  • Intuitive explanation of high-dimensional data
  • Visualize & analyze

Feature Crosses

  • Combine multiple features together into a new feature
  • Encodes nonlinearity in the feature space, or encodes the same information in fewer features
  • $[A \times B]$ : multiplying the values of two features
  • $[A\times B\times C \times D \times E ]$: multiplying the values of 5 features
  • $[\text{Day of week, hour}] \rightarrow [\text{Hour of week}]$
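A toy sketch of the hour-of-week cross from the last bullet (column names are made up):

```python
import pandas as pd

df = pd.DataFrame({'day_of_week': [0, 3, 6], 'hour': [9, 14, 23]})  # toy data

# Cross [day_of_week, hour] -> [hour_of_week]: one feature, 7 * 24 = 168 values
df['hour_of_week'] = df['day_of_week'] * 24 + df['hour']
```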


  • Feature crossing : synthetic feature encoding nonlinearity in feature space
  • Feature coding : Transforming categorical to a continuous variable.

Feature Transformation at Scale — Preprocessing Data at Scale

To do feature transformation at scale, we need an ML pipeline that deploys our model with consistent and reproducible results.


Preprocessing at scale


Inconsistencies in feature engineering

  • Mobile (TensorFlow Lite)
  • Server (TensorFlow Serving)
  • Web (TensorFlow JS)
  • Skews will lower the performance of your serving model

Preprocessing granularity


Applying Transformation per batch

  • For example, normalizing features by their average
  • Access to a single batch of data, not the full dataset
  • Normalize by average within a batch
  • Precompute average and reuse it during normalization

Optimizing instance-level transformations

  • Indirectly affect training efficiency
  • Typically accelerators sit idle while the CPUs transform
  • Prefetching transforms for better accelerator efficiency

Summarizing the challenges

  • Balancing predictive performance
  • Full-pass transformation on training data
  • Optimizing instance-level transformation for better training efficiency (GPUs, TPUs,...)
  • Inconsistent data affects the accuracy of the results
  • Need for scaled data-processing frameworks to process large datasets in an efficient and distributed manner

TensorFlow Transform


ExampleGen : generates examples from the training & evaluation data

StatisticsGen : generates statistics

SchemaGen : generates a schema after ingesting the statistics. This schema is then fed to:

  • ExampleValidator : takes the schema and statistics and looks for problems/anomalies in the data
  • Transform : takes the schema and dataset and does the feature engineering

Trainer : Trains the model

Evaluator : Evaluates the result

Pusher : Pushes to wherever we want to serve our model.


tf.Transform: Going Deeper


tf.Transform Analyzers

Analyzers make a full pass over the dataset in order to collect the constants that are required for feature engineering. They also express the operations that will be applied.


How Transform applies feature transformations


Benefits of using tf.Transform

  • Emitted tf.Graph holds all necessary constants and transformations
  • Focus on data preprocessing only at training time
  • Works in-line during both training and serving
  • No need for preprocessing code at serving time
  • Consistently applied transformations irrespective of deployment platform

Analyzers framework


tf.Transform preprocessing_fn

Commonly used imports, and a "hello world" with tf.Transform.
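A minimal sketch along the lines of the canonical tf.Transform getting-started example, run with the local Beam runner; the toy data is made up:

```python
import tempfile
import tensorflow as tf
import tensorflow_transform as tft
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.tf_metadata import dataset_metadata, schema_utils

# Toy data and its metadata (schema)
raw_data = [{'x': 1.0}, {'x': 2.0}, {'x': 3.0}]
raw_data_metadata = dataset_metadata.DatasetMetadata(
    schema_utils.schema_from_feature_spec(
        {'x': tf.io.FixedLenFeature([], tf.float32)}))

def preprocessing_fn(inputs):
    # tft.mean is an analyzer: a full pass over the data computes the constant
    return {'x_centered': inputs['x'] - tft.mean(inputs['x'])}

with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
    transformed_dataset, transform_fn = (
        (raw_data, raw_data_metadata)
        | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))

transformed_data, transformed_metadata = transformed_dataset
print(transformed_data)  # x_centered: -1.0, 0.0, 1.0
```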


Inspect data and prepare metadata

Preprocessing data (transform), running the pipeline.


  • tf.Transform allows the preprocessing of input data and creating features
  • tf.Transform allows defining pre-processing pipelines and their execution using large-scale data processing frameworks, like Apache Beam.
  • In a TFX pipeline, the Transform component implements feature engineering using TensorFlow Transform

Feature Selection — Feature Spaces

  • N-dimensional space defined by your N features
  • Not including the target label

Feature space coverage

  • Same numerical ranges
  • Same classes
  • Similar characteristics for image data
  • Similar vocabulary.

Ensure feature space coverage

  • Data affected by: seasonality, trend, drift
  • Serving data: new values in features and labels
  • Continuous monitoring, key for success!

Feature Selection

  • Feature selection identifies the features that best represent the relationship between the features and the target we're trying to predict
  • Remove features that don't influence the outcome
  • Reduce the size of the feature space
  • Reduces the resource requirements and model complexity


Unsupervised feature selection methods

  • The feature-target relationship is not considered
  • If two features are highly correlated, you might need only one

Supervised feature selection

  • Uses features-target variable relationship
  • Selects those contributing the most


Filter methods

  • Correlated features are mostly redundant, so we remove one of them
  • Filter methods evaluate features without training a model, so they are fast, but they can miss interactions between features

Popular filter methods:

  • Correlation : between features, and between the features and the label
  • Univariate feature selection


Feature comparison statistical tests

  • Pearson's correlation: Linear relationships
  • Kendall Tau Rank Correlation Coefficient: Monotonic relationships & small sample size
  • Spearman's Rank Correlation Coefficient: Monotonic relationships

Other methods:

  • Pearson correlation ( numeric features - numeric target; exception: when the target is 0/1 coded )
  • ANOVA f-test (numeric features - categorical target)
  • Chi-squared (categorical features - categorical target)
  • Mutual information

Determining correlation

Selecting features, univariate feature selection in sklearn.

Sklearn univariate feature selection routines:

  • SelectKBest
  • SelectPercentile
  • GenericUnivariateSelect

Statistical tests available:

  • Regression: f_regression , mutual_info_regression
  • Classification: chi2 , f_classif , mutual_info_classif

SelectKBest implementation
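A minimal sketch; the breast-cancer dataset is a toy stand-in, not the assignment's data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)  # toy stand-in dataset

# Keep the 10 features with the highest ANOVA F-scores
selector = SelectKBest(score_func=f_classif, k=10)
X_new = selector.fit_transform(X, y)
print(X.shape, '->', X_new.shape)  # (569, 30) -> (569, 10)
print(selector.get_support())      # boolean mask of the kept columns
```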

Wrapper methods

A wrapper method is a search over subsets of your features, using a model as the measure of their effectiveness.

Wrapper methods are based on greedy algorithms, and these solutions are slow to compute.

Popular methods include:

  • Forward Selection
  • Backward Elimination
  • Recursive Feature Elimination

Forward Selection


  • Iterative, greedy method
  • Starts with 1 feature
  • Evaluate model performance when adding each of the additional features, one at a time.
  • Add next feature that gives the best performance
  • Repeat until there is no improvement

Backward Elimination

  • Start with all features
  • Evaluate model performance when removing each of the included features, one at a time.
  • Remove the next feature that gives the best performance
  • Repeat until there is no improvement

Recursive Feature Elimination (RFE)

  • Select a model to use for evaluating feature importance
  • Select the desired number of features
  • Fit the model
  • Rank features by importance
  • Discard least important features
  • Repeat until the desired number of features remains
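A minimal RFE sketch with sklearn, again on a toy stand-in dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = load_breast_cancer(return_X_y=True)  # toy stand-in dataset

# Repeatedly fit the model, rank features by importance, drop the weakest
rfe = RFE(estimator=RandomForestClassifier(random_state=42),
          n_features_to_select=10)
rfe.fit(X, y)
print(rfe.support_)  # mask of the 10 surviving features
```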

Embedded Methods

  • L1 regularization

Feature importance

  • Assigns scores for each feature in data
  • Discard the features with the lowest importance scores

Feature importance with Sklearn

  • Feature importance is built into tree-based models ( e.g., RandomForestClassifier )
  • Feature importances are available via the feature_importances_ property
  • We can then use SelectFromModel to select features from the trained model based on the assigned feature importances

Select features based on importance
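A minimal sketch combining feature_importances_ with SelectFromModel; the dataset and threshold choice are assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)  # toy stand-in dataset

model = RandomForestClassifier(random_state=42).fit(X, y)
print(model.feature_importances_)  # one importance score per feature

# Keep only the features whose importance exceeds the median importance
selector = SelectFromModel(model, prefit=True, threshold='median')
X_selected = selector.transform(X)
print(X.shape, '->', X_selected.shape)
```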

Tying together and evaluation.

If you wish to dive more deeply into the topics covered this week, feel free to check out these optional references. You won’t have to read these to complete this week’s practice quizzes.

Mapping raw data into features

Feature engineering techniques

Embedding projector

Encoding features

  • https://www.tensorflow.org/tfx/guide#tfx_pipelines
  • https://ai.googleblog.com/2017/02/preprocessing-for-machine-learning-with.html


( figure : feature engineering workflow )

Intro to Feature Engineering for Machine Learning with Python

Feature Engineering is the process of transforming data to increase the predictive performance of machine learning models.

Introduction

You should already know:

You should already be comfortable with Python and Pandas . You can learn both interactively at Dataquest .

You will learn:

By the end of this article you should know:

  • Two primary methods for feature engineering
  • How to use Pandas and Numpy to perform several feature engineering tasks in Python
  • How to increase predictive performance of a real dataset using these tasks

Feature engineering is arguably the most important, yet overlooked, skill in predictive modeling. We employ it in our everyday lives without thinking about it!

Let me explain - let's say you're a bartender and a person comes up to you and asks for a vodka tonic. You proceed to ask for ID and you see the person's birthday is "09/12/1998". This information is not inherently meaningful, but you add up the number of years by doing some quick mental math and find out the person is 22 years old (which is above the legal drinking age). What happened there? You took a piece of information ("09/12/1998") and transformed it to become another variable (age) to solve the question you had ("Is this person allowed to drink?").

Feature engineering is exactly this but for machine learning models. We give our model(s) the best possible representation of our data - by transforming and manipulating it - to better predict our outcome of interest. If this isn’t 100% clear now, it will be a lot clearer as we walk through real examples in this article.

Feature engineering is both useful and necessary for the following reasons:

  • Often better predictive accuracy: Feature engineering techniques such as standardization and normalization often lead to better weighting of variables which improves accuracy and sometimes leads to faster convergence.
  • Better interpretability of relationships in the data: When we engineer new features and understand how they relate with our outcome of interest, that opens up our understanding of the data. If we skip the feature engineering step and use complex models (that to a large degree automate feature engineering), we may still achieve a high evaluation score, at the cost of better understanding our data and its relationship with the target variable.

Feature engineering is necessary because most models cannot accept certain data representations. Models like linear regression, for example, cannot handle missing values on their own - they need to be imputed (filled in). We will see examples of this in the next section.

week 2 assignment feature engineering

Every data science pipeline begins with Exploratory Data Analysis (EDA), or the initial analysis of our data. EDA is a crucial pre-cursor step as we get a better sense of what features we need to create/modify. The next step is usually data cleaning/standardization depending on how unstructured or messy the data is.

Feature engineering follows next and we begin that process by evaluating the baseline performance of the data at hand. We then iteratively construct features and continuously evaluate model performance (and compare it with the baseline performance) through a process called feature selection, until we are satisfied with the results.

What this article does and does not cover

Feature engineering is a vast field as there are many domain-specific tangents. This article covers some of the popular techniques employed in handling tabular datasets. We do not cover feature engineering for Natural Language Processing (NLP), image classification, time-series data, etc.

The two approaches to feature engineering

There are two main approaches to feature engineering for most tabular datasets:

  • The checklist approach: using tried and tested methods to construct features.
  • The domain-based approach: incorporating domain knowledge of the dataset’s subject matter into constructing new features.

We will now look at these approaches in detail using real datasets. Note, these examples are quite procedural and focus on showing how you can implement it in Python. The case study following this section will show you a real end-to-end scenario use case of the practices we touch upon in this section.

Before we load the dataset, we import the following dependencies shown below.
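A plausible set of imports (and the data load), since the original code block didn't survive extraction; the CSV file name is an assumption:

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Assumed file name for the (modified) Kaggle supermarket sales dataset
df = pd.read_csv('supermarket_sales.csv')
```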

We will now demonstrate the checklist approach using a dataset on supermarket sales. The original dataset, and more information about it, is linked here . Note, the dataset has been slightly modified for this tutorial.

The columns are described as follows:

  • Invoice ID - Computer generated sales slip invoice identification number
  • Branch - Branch of supercenter (3 branches are available identified by A, B, and C).
  • City - Location of supercenters
  • Customer type - Type of customer: "Member" for customers using a member card and "Normal" otherwise
  • Gender - Gender type of customer
  • Product line - General item categorization groups
  • Unit price - Price of each product in $
  • Quantity - Number of products purchased by the customer
  • Tax 5% - 5% tax fee on the customer's purchase
  • Total - Total price including tax
  • Date - Date of purchase
  • Time - Purchase time
  • Payment - Payment used by the customer for their purchase
  • cogs - Cost of goods sold
  • gross margin percentage
  • gross income
  • Rating - Customer satisfaction rating of their overall shopping experience (on a scale of 1 to 10)

The Checklist Approach

Numeric aggregations

Numeric aggregation is a common feature engineering approach for longitudinal or panel data - data where subjects are repeated. In our dataset, we have categorical variables with repeated observations (for example, we have multiple entries for each supermarket branch).

Numeric aggregation involves three parameters:

  • Categorical column
  • Numeric column(s) to be aggregated
  • Aggregation type: Mean, median, mode, standard deviation, variance, count etc.

The below code chunk shows three examples of numeric aggregations based on mean, standard deviation and count respectively.

In the following block our three parameters are:

  • Branch – categorical column, which we're grouping by
  • Tax 5%, Unit Price, Product line, and Gender – numeric columns to be aggregated
  • Mean, standard deviation, and count – aggregations to be used on the numeric columns

Below, we group by Branch and perform three statistical aggregations (mean, standard deviation, and count) by transforming the numeric columns of interest. For example, in the first column assignment, we calculate the mean Tax 5% and mean Unit price for every branch, which gives us two new columns - tax_branch_mean and unit_price_mean in the data frame.
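A sketch of those aggregations with groupby/transform; tax_branch_mean and unit_price_mean are named in the text, while the remaining column names are assumptions:

```python
# Group by Branch and broadcast each aggregate back onto every row
df['tax_branch_mean'] = df.groupby('Branch')['Tax 5%'].transform('mean')
df['unit_price_mean'] = df.groupby('Branch')['Unit price'].transform('mean')

# Standard deviation and count aggregations (new column names are assumed)
df['tax_branch_std'] = df.groupby('Branch')['Tax 5%'].transform('std')
df['product_line_count'] = df.groupby('Branch')['Product line'].transform('count')
df['gender_count'] = df.groupby('Branch')['Gender'].transform('count')
```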

And we see the features we've just created below.

Note: since we're viewing a column subset of the full df , it looks like there are duplicate rows. When the rest of the columns are visible you'll notice there aren't duplicate rows, but there are still duplicate values. This is by design.

Choosing numeric aggregation parameters

How do we pick which three parameters to use? Well, that will depend on your domain knowledge and your understanding of the dataset. For example, in this dataset, if you feel like the variation in the average (aggregation type) Rating (numeric variable) based on the Branch (categorical column) is important in predicting gross income (target variable), create the feature! If you feel like the count of the products in the Product Line, by branch, is important in informing gross income, encode that as a feature!

Now if you can test as many combinations of the three parameters - go ahead - as long as you are meticulous at selecting only those features that have enough predictive power i.e. be sure to have a rigorous feature selection process.

Below we can see a couple of the columns we created ( tax_branch_mean and unit_price_mean ). They are aggregations based on the Branch variable.


But why is all of this necessary?

Now before I go on any further, you may be wondering why this is even necessary - aren't good models designed to take all of these aggregations into account? To an extent, yes, but not always. It depends a lot on the size and dimensionality (number of columns) of your dataset. The larger the dataset, the more features (by several orders of magnitude) you can create. When there are too many features, the model has too many competing signals to predict the target variable.

Feature engineering tries to explicitly focus the model's attention on certain features. To summarize, feature engineering is not about creating "new" information, but rather directing and/or focusing the model's attention on certain information, that you as the data scientist judge to be important.

Indicator Variables and Interaction Terms

Following the same pattern of thinking as numeric aggregations, we can construct indicator variables and interaction terms.

Indicator variables only take on the value 0 or 1 to indicate the absence or presence of some information.

For example, below we define an indicator variable unit_price_50 to indicate if the product has a unit price greater than 50. To put it into perspective, think of an e-commerce store having free shipping on all orders above $50; this may be useful information in predicting customer behavior and worth an explicit definition for the model.

Interaction terms are created based on the presence of interaction effects between two or more variables. This is largely driven by domain expertise, although there are statistical tests to help determine them (which is beyond the scope of this article). For example, while free shipping may affect customer rating, free shipping combined with quantity may have a different effect on customer rating, which would be useful to encode (assuming customer rating is the target variable in this case). Below we define the variable unit_price_50 * qty to be exactly that.

We use np.where() to create an indicator variable unit_price_50 that encodes 1 when unit price is above 50 and 0 otherwise.
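A sketch of both features; the interaction column name is an assumption:

```python
# Indicator: 1 when the unit price exceeds 50, else 0
df['unit_price_50'] = np.where(df['Unit price'] > 50, 1, 0)

# Interaction term between the indicator and quantity
df['unit_price_50_qty'] = df['unit_price_50'] * df['Quantity']
```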

Numeric Transformations

Some data scientists don't consider numeric transformations to fall under feature engineering. This is because many models, especially the newer ones like tree-based models (decision trees, random forests, etc.), are not impacted by these transformations. In other words, performing these transformations does nothing to improve predictive performance. But for other models such as linear regression, these transformations can make a big difference as they are sensitive to the scale of their variables.

Below we construct a new variable log_cogs to correct for the right skew in the variable cogs . The effect is shown in the plots below the code chunk.

We can also do other transformations such as squaring a variable (shown in the code chunk below) if we believe the relationship between a predictor and target variable is not linear, but quadratic in nature (as a predictor variable changes, target variable changes by an order of 2).

We can even have cubed variables or any n degree polynomial term - it is up to your discretion and domain knowledge.
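A sketch of the two transformations above; the choice of Quantity for the quadratic term is an assumption:

```python
# Log transform corrects the right skew; +1 guards against log(0)
df['log_cogs'] = np.log(df['cogs'] + 1)

# A quadratic term
df['quantity_squared'] = df['Quantity'] ** 2
```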


As we can see, the log transformation made the distribution of Cost of Goods Sold (cogs) more normally distributed (or less right-skewed). This will benefit models like linear regression as their weights/coefficients won't be strongly influenced by outliers that caused the initial skewness.

As an aside, since we'll be comparing plots next to each other like this many times during the article, we'll just use this helper function from now on:
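The original helper didn't survive extraction; this is a hypothetical reconstruction:

```python
def plot_side_by_side(before, after, titles):
    """Hypothetical helper: histogram two series next to each other."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, series, title in zip(axes, (before, after), titles):
        ax.hist(series.dropna(), bins=30)
        ax.set_title(title)
    plt.show()

# e.g. plot_side_by_side(df['cogs'], df['log_cogs'], ['cogs', 'log_cogs'])
```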

Numeric Scaling

The columns in a dataset are usually on different scales. In our dataset, for example, 'gross income' and 'Rating' are on very different scales (as seen below). To correct for this we can perform 'normalization' to put both columns on a 0-1 scale.

Why do we do this? When predictor variables are on very different scales, models like linear regression may bias coefficients to variables on a larger scale. So we correct for this by normalizing those numeric variables.

We can normalize a variable in many ways, but the most common way to do it is by using the min-max scaler (shown below the plots). The formula is shown below - for each value in the column, we subtract the minimum value of the column and divide the resulting number with the range of the column ($max - min$).

$$\large X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

We can see the range of gross income and Rating currently in our dataset:


We can see the difference in scale after applying normalization below.
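A minimal sketch of the normalization step with sklearn's min-max scaler:

```python
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()  # defaults to the [0, 1] range
df[['gross income', 'Rating']] = scaler.fit_transform(
    df[['gross income', 'Rating']])
```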


Notice the graphs look the same but the scaling on the x-axis is between 0 and 1 now.

Categorical Variable Handling

One-hot encoding

Machine learning models can only handle numeric variables. Therefore we must encode categorical variables as numeric ones. The easiest way to do this is to 'one-hot-encode' them which means we create $n$ indicator variables for a categorical column with $n$ categories. The below code shows how we can one-hot-encode two categorical columns - Gender and Payment .
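A one-line sketch with pandas:

```python
# Replace Gender and Payment with one indicator column per category
df = pd.get_dummies(df, columns=['Gender', 'Payment'])
```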

But there are problems with this approach. If we have a column with 1000 categories, one-hot-encoding that one column will create 1000 new columns! That's a lot! You're feeding the model way too much information and it naturally is much harder to find patterns. When we have too much dimensionality, our model will take much longer to train and find the optimal predictor weights.

Target encoding

To resolve this, we can use target encoding. Target encoding does not create additional columns. The idea is simple - For each unique category, the average value of the target variable (assuming it is either continuous or binary) is calculated and that becomes the value for the respective category in the categorical column.

Let's look at a simple example first before we apply it to our dataset. We have two columns - the target and the predictor variable. Our goal is to encode the predictor variable (a categorical column) into a numeric variable that can be used by the model. To do this we simply group by the predictor variable to get the mean target value for each predictor category. So for predictor a the encoded value will be the mean of 1 and 5, which is 3. For b it is the mean of 4 and 6, which is 5. Now our categorical column is a numeric column!

Next, we use target encoding in our supermarket dataset. For the example below, we use Product line as the categorical column that is target encoded, and Rating is the target variable, which is a continuous variable.
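A sketch of that encoding; the new column name is an assumption:

```python
# Mean Rating per Product line, mapped back onto each row
product_line_means = df.groupby('Product line')['Rating'].mean()
df['product_line_encoded'] = df['Product line'].map(product_line_means)
```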


Target encoding does have its downsides - when a category only appears once, the mean value of that category is the value itself (the mean of one number is the number itself). In general, it isn’t always a good idea to rely on an average when the number of values used in the average is low. It leads to problems with generalizing results in the training dataset to the testing dataset, or data the model isn't trained on.

The takeaway from this section is to attempt one-hot-encoding if dimensionality won't be a problem. If it is a problem, you can use other approaches like target encoding.

Missing Value Handling

Predictive modeling can be thought of as extracting the right signals from a dataset. Missing values can either be a source of signal themselves (when values are not missing at random) or they can be an absence of signal (when values are missing at random).

Note: the data was modified to contain missing values so we could discuss this topic. If you get a fresh copy from Kaggle, it shouldn't have any missing values.

For example, let's say we have some population data and we add a column called has_license indicating whether a person has a driver's license or not. We will notice missing values - a disproportionate amount of them being people under the age of 18. This is a case where values are NOT missing at random. Now if we have a few missing values in the Gender column caused by data entry issues, those values are likely to be missing at random.

Why is this important? If we have missing data that isn't random, we know why the values are missing, and it can be explained by the dataset, we can simply encode that as an indicator variable. This would allow the model to easily figure it out. However, if the explanation for why they are missing is not contained in the dataset, then we are in murky territory, and handling such a case requires more advanced attention.

When data is missing at random, we have a loss of information, but we hope we can fill in those gaps based on information from other features.

The least we can do is remove the rows with missing data, as most models don't handle missing data. Since columns with too many missing values don't usually provide a helpful signal, we could remove them based on a threshold condition for missingness (shown below).

But before we fill in missing values, it may be useful to first visualize the missing values using Seaborn.
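A minimal sketch of the visualization:

```python
# Each light cell marks a missing value
sns.heatmap(df.isnull(), cbar=False)
plt.show()
```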


We see there are missing values in a few columns - Customer type , a categorical column, having the most missing values. Usually, columns with too many missing values don't provide enough signal for prediction - so some practitioners decide to remove those columns by setting a threshold for "missingness". In the below code chunk we set the threshold to be 70% and remove columns and rows that meet these conditions.

I do not recommend this strategy - there may still be useful information in these columns/rows and I would let the feature selection process decide whether or not to keep/remove columns. Regardless, if you simply want to build a quick baseline model you may employ this strategy.

Here's how you might remove missing values for a certain threshold:
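A sketch under the 70% threshold described above:

```python
# Drop columns that are more than 70% missing, then rows more than 70% missing
df = df.loc[:, df.isnull().mean() < 0.7]
df = df.loc[df.isnull().mean(axis=1) < 0.7]
```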

Alternatively (preferably), we can impute missing values with a single value such as the mean or median of the column. For categorical columns, we could impute missing values with the mode, or most frequent category in the column.
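A sketch of single-value imputation (median for numeric columns, mode for categorical ones):

```python
for col in df.columns:
    if df[col].dtype == 'object':
        df[col] = df[col].fillna(df[col].mode()[0])  # categorical -> mode
    else:
        df[col] = df[col].fillna(df[col].median())   # numeric -> median
```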

Now we see no more missing values in the dataset!


There are more complicated imputation techniques beyond the scope of this article, but this should be enough to get you started. If interested in further exploration in handling missing data, I highly recommend checking out Missing Data by Paul D. Allison.

Date-Time Decomposition

Date-time decomposition is quite simply breaking a date variable down into its constituents. We do this because the model needs to work with numeric variables.
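A sketch using the dataset's Date column:

```python
df['Date'] = pd.to_datetime(df['Date'])
df['year'] = df['Date'].dt.year
df['month'] = df['Date'].dt.month
df['day'] = df['Date'].dt.day
```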

What we've just done is separate out the date column which was in the format "year-month-day" into individual columns, namely year, month, and day. This is information that the model can now use to make predictions, as the new columns are numeric.

Domain-based Approach

There isn't a strict boundary between domain-based and checklist-based approaches to feature engineering. The distinction, I would say, is quite subjective - with domain-based features, you still apply a lot of the techniques we've already discussed, but with a heavy emphasis on domain knowledge.

Domain-based features will involve a lot of ad-hoc metrics like ratios, formulas, etc. We will see examples of this in the case study below.

Case Study Example - Movie Box Office Data

Now that we have learned several feature engineering techniques, let's apply them!

For our case study, we will be working with movie box office data. You can find more information about the dataset by clicking here .

Normally, our first step would be to conduct exploratory data analysis on the dataset, but since this is an article about feature engineering, we will focus on that. Note, a lot of the ideas for feature engineering shown below were inspired by a Kaggle kernel linked here .


Filling missing values

First, let's handle missing values. We visualize them using Seaborn and then fill in numeric missing values with the median and categorical missing values with the mode.


We will fill in the missing numeric variables with the median and the categorical column with the mode. We'll address the categorical missing values after we finish feature engineering other columns (at the very end).

There isn't a hard science to choosing what missing value imputation approach you take. Most practitioners test multiple missing value imputation techniques and decide on the one that gets the best evaluation score.

Decomposing Date

And now we can decompose the date column to its attributes. Note we encode month and day as string variables as there isn't a numeric relationship within them. Days and months have fixed bounds (month doesn't go above 12, the day doesn't go above 31). Day number 10 and 31 are simply different days (think of them as categories).

Let's put year, month, and day into their own columns in the dataframe:
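A sketch, assuming the date column is named 'release_date' (standard for this dataset, but not confirmed by the text):

```python
df['release_date'] = pd.to_datetime(df['release_date'])
df['year'] = df['release_date'].dt.year
df['month'] = df['release_date'].dt.month.astype(str)  # categorical, not numeric
df['day'] = df['release_date'].dt.day.astype(str)
```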

Adjusting budget

Since the budget is highly right-skewed, we take the logarithm of the budget to adjust for it. Note we take the logarithm of the budget + 1 as a lot of movies have a budget of $0 and we cannot take the logarithm of 0.
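A one-line sketch; the new column name is an assumption:

```python
df['log_budget'] = np.log(df['budget'] + 1)  # +1: many movies have budget 0
```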


Encoding inflation

We know the budget increases yearly to some extent due to inflation. We can encode that using a simple inflation formula as follows:

$$\large InflationBudget_i = Budget_i \big( 1 + \frac{1.8}{100} \times (MaxYear - Year_i) \big) $$

Where $i$ is each row and $MaxYear$ is the maximum year of the dataset (2018 in our case). Here's creating it for our dataframe:
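A sketch of the formula above; the new column name is an assumption:

```python
max_year = df['year'].max()  # 2018 in this dataset
df['inflation_budget'] = df['budget'] * (
    1 + 1.8 / 100 * (max_year - df['year']))
```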


Other interesting features

Based on domain knowledge, we can create some useful ratio variables as shown below.
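A hypothetical sketch of such ratios; the exact definitions follow the referenced Kaggle kernel, so treat these formulas as assumptions:

```python
# Hypothetical ratio features
df['budget_year_ratio'] = df['budget'] / (df['year'] * df['year'])
df['budget_runtime_ratio'] = df['budget'] / df['runtime']
```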

Indicator variables

We encode an indicator variable indicating whether a movie has a homepage or not, and whether the movie was in English:
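A sketch, assuming the source columns are named 'homepage' and 'original_language':

```python
df['has_homepage'] = np.where(df['homepage'].isnull(), 0, 1)
df['is_english'] = np.where(df['original_language'] == 'en', 1, 0)
```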

And now we can fill in the missing categorical column values.

We subset the data frame to include only the variables we want.

We one-hot-encode the categorical columns. In our case, we only have one categorical column - month.
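A one-line sketch:

```python
df = pd.get_dummies(df, columns=['month'])
```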

Our new data frame looks like this:

Now that our dataset is ready, we can go through the process of selecting useful features (feature selection) and make predictions.

To prove feature engineering works, and improves the performance of the model, we can build a simple regression model to predict the revenue of movies.

Normally we pick what features to use via a process called feature selection, however, since this article is focused on feature engineering, we will employ a simple process of selecting features: correlation analysis .

By plotting the correlation matrix (below), we see most of the features we created aren't that predictive of revenue. This is what happens most of the time - you build a ton of features, but only a few end up being useful - but those features that are useful make a difference.

From the plot below we will use has_homepage , budget_year_ratio , and is_english in our model, in addition to features that came before feature engineering - budget , runtime and popularity .


We will use approximately 80% of the dataset for training the baseline and feature engineered models, and compare their performances on the hidden test set.

The difference is quite stark! The baseline model, which uses only budget , runtime and popularity as features, makes predictions that are on average $909,146.83 worse than the model where we used our constructed features. We came to that conclusion by comparing the Root Mean Squared Error (RMSE) of both models on the test set.

By using feature engineering, we allowed our model to get a better understanding of our dataset, and therefore make better predictions.

Feature Engineering Pitfalls

Some of the common pitfalls of feature engineering are:

  • Overfitting: When we construct too many features, we risk overfitting the data. This is often referred to as the curse of dimensionality . Briefly, the more features a model has the more flexibility it has to establish relationships between the predictor and the target variable. This may sound like a good thing, but if a model has too much flexibility, it will, in a sense, over-optimize on the data it is trained on. This will result in a high-performance score but will perform poorly on hidden data or new data, as new data will have differences not observed in the training data. We have to be mindful of this and take into consideration out of sample testing when evaluating features (during the feature selection step).
  • Information leakage: If feature engineering is not done properly it could lead to information leakage. This usually involves the construction of new features using the target variable. Feature engineering must always be done independently of the target variable and must only include predictor variables of interest.

The feature engineering mindset

The feature engineering mindset is very experimental. Generally, quantity is valued over quality. Quality comes into play when we deal with feature selection which happens after feature engineering. We may have some direction as to what features may be useful, but we should not let our bias come into play - construct as many relevant features as possible from your data (computation and time permitting of course) and follow it up with a robust feature selection process to weed out bad features.

Course Recommendations

Further learning:

Many data science and machine learning courses contain sections on feature engineering. Here's a great course to check out:

Feature Engineering with Google Cloud


Programming Assignment: Feature Engineering

When creating a post, please add:

  • Week # 2

When I submit the assessment, it throws an error. Later, while fixing the error and resubmitting, some of the cells were deleted. How can I get those deleted cells back? Can I restart the whole assessment? I don't have any backup of this assessment.

Please follow these steps to refresh your workspace

Thank you so much!!

APDaga DumpBox : The Thirst for Learning...
Coursera: Machine Learning (Week 2) [Assignment Solution] - Andrew NG

Recommended Machine Learning Courses:

  • Coursera: Machine Learning
  • Coursera: Deep Learning Specialization
  • Coursera: Machine Learning with Python
  • Coursera: Advanced Machine Learning Specialization
  • Udemy: Machine Learning
  • LinkedIn: Machine Learning
  • Eduonix: Machine Learning
  • edX: Machine Learning
  • Fast.ai: Introduction to Machine Learning for Coders
  • ex1.m - Octave/MATLAB script that steps you through the exercise
  • ex1 multi.m - Octave/MATLAB script for the later parts of the exercise
  • ex1data1.txt - Dataset for linear regression with one variable
  • ex1data2.txt - Dataset for linear regression with multiple variables
  • submit.m - Submission script that sends your solutions to our servers
  • [*] warmUpExercise.m - Simple example function in Octave/MATLAB
  • [*] plotData.m - Function to display the dataset
  • [*] computeCost.m - Function to compute the cost of linear regression
  • [*] gradientDescent.m - Function to run gradient descent
  • [#] computeCostMulti.m - Cost function for multiple variables
  • [#] gradientDescentMulti.m - Gradient descent for multiple variables
  • [#] featureNormalize.m - Function to normalize features
  • [#] normalEqn.m - Function to compute the normal equations
  • Video - YouTube videos featuring Free IOT/ML tutorials

Solution code is provided for each of the following files: warmUpExercise.m, plotData.m, computeCost.m, gradientDescent.m, computeCostMulti.m, gradientDescentMulti.m, featureNormalize.m, and normalEqn.m.

163 Comments


Have you got prediction values as expected?


Yes. We got prediction values as expected.

My program ran successfully, but after hitting submit and entering the token, this error shows. Please help:

error: structure has no member 'message'
error: called from
  submitWithConfiguration at line 35 column 5
  submit at line 45 column 3
error: evaluating argument list element number 2
error: called from
  submitWithConfiguration at line 35 column 5
  submit at line 45 column 3

A submission configuration error generally means your directory is not right! Or it could also mean you didn't extract the files properly... it did happen to me at times.

I have a similar problem. Please tell me if you have solved it.


Thanks for your comments. I still have some problems with the solutions, could you help me? In this case it is with line 17, J_history... Week 2:

```matlab
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
data = load('ex1data1.txt');
X = data(:,1);
y = data(:,2);
m = length(y);
x = [ones(m, 1), data(:,1)];
theta = zeros(2, 1);
iterations = 1500;
alpha = 0.01;
J = (1 / (2*m)) * sum(((x*theta) - y).^2);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector theta.
    % Hint: While debugging, it can be useful to print out the values
    % of the cost function (computeCost) and gradient here.
    %error = (X * theta) - y;
    %temp0 = theta(1) - ((alpha/m) * sum(error .* X(:,1)));
    %temp1 = theta(2) - ((alpha/m) * sum(error .* X(:,2)));
    %theta = [temp0; temp1];
    % ============================================================
    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end
end
```

Change the variable name of the iteration: num_iters must match the declared variable name.

Can you elaborate?

Hi, can anyone help me? I just started ML. I am using the Octave UI, where I write the code, but I don't know how to submit using the UI. Can anybody please help me?

https://www.youtube.com/watch?v=Vsg-cq7169U&feature=youtu.be Watch this video by one of the mentors you will get it .

Thanks Hrishikesh, your comment might help many people.


>> gradientDescent()
error: 'y' undefined near line 7 column 12
error: called from gradientDescent at line 7 column 3
>> computeCost()
error: 'y' undefined near line 7 column 12
error: called from computeCost at line 7 column 3

How do I correct this?

I re-ran the code and everything worked perfectly fine for me. Please check your code. In the code, you can see the variable "y" is defined in the parameter list itself, so logically you should not get that error. There must be something else you are missing outside these functions.

I used to get the same error! I realized I had to execute the ex1.m and ex1_multi.m files to check our code.

Thank you for your response. It will be helpful for many others...

Hey @Akshay, I am facing the same problem of 'y' undefined. I tried all the ways suggested by you and by others. Can you please help me out? Can you please tell me which version of Octave I should use for Windows 8.1 64-bit? Presently I am using 4.4.1; maybe that is why I am facing this problem. Please help.

Please tell me how to execute the ex1.m file in online MATLAB. Please help.

computeCost
error: 'y' undefined near line 8 column 12
error: called from computeCost at line 8 column 3
gradientDescent
error: 'y' undefined near line 7 column 12
error: called from gradientDescent at line 7 column 3

How do I correct this?

I re-ran the code and everything worked perfectly fine for me. Please check your code: the variable "y" is defined in the parameter list itself, so logically you should not get that error. There must be something else you are missing outside these functions. If you find the solution, please confirm here; it will be helpful for others.

Hi, Receiving similar error. Found a solution?

Hello, I got a similar error! Did you find the solution?

Hi Sasank, small y is already used as an input argument of the mentioned functions, so you shouldn't get an error like 'y is undefined'. Are you sure you haven't made a mistake like mixing up small y and capital Y? Please check it and try again.

error: 'X' undefined near line 9 column 10 error: called from featureNormalize at line 9 column 8

Has anyone found the solution? I am getting the same error from the program. I have tried a number of ways but keep getting the same problem.

Yes, I am also getting the same error.

I have the solution: you have to load the data X, y first:

```matlab
data = load('ex1data1.txt');
X = data(:,1);
y = data(:,2);
m = length(y);
x = [ones(m,1), data(:,1)];
theta = zeros(2,1);
computeCost(X, y, theta)
```

If you have any questions, please contact me on Instagram: t.boy__jr

I was stuck for two months on the Week 2 assignment of Machine Learning. Thanks to your guidance I can now understand the coding in a better way, and I have finally passed the Week 2 assignment.

Glad to know that my work helped you in understanding the topic / coding. You can also check out free IoT tutorials with source code and demos here: https://www.apdaga.com/search/label/IoT%20%28Internet%20of%20Things%29 Thanks.

I tried to re-run the code, but I am getting this error:

error: 'num_iters' undefined near line 17 column 19
error: called from gradientDescent at line 17 column 11

How do I correct this?

I am also facing the same problem. Please help me out of it.

Facing the same problem...

I am also submitting these assignments and have done the same, but I don't know where to load the data, thus my score is 0. How can I improve? Please suggest.

Refer to the forum within the course on Coursera. They have explained the steps to submit the assignments in detail.

Hello, in the gradientDescent.m file: theta = theta - ((alpha/m) * X'*error); I am confused: why do we take the transpose of X (X'*error) instead of X? Thanks in advance. B

Hi Bruno, I got your confusion. Here X (capital X) represents all the training data together, each row being one training sample and each column a feature. We want to multiply each training sample by its corresponding error, and to make that happen you have to take the transpose of X. If you take x (small x) as a single training sample, then you don't have to worry about the transpose at all; simply (x * error) will work. Try doing it manually on paper; you will understand it.

Hi Akshay Thank you for the quick reply & help ...It s totally clear now, make sense !!! Have a great day Bruno

Good day, please, I am kind of stuck with the Week 2 programming assignment, under computeCost.m. I already imported the data and plotted the scatter plot. Do I also have to import the data in the computeCost.m working area? When I inputted the code J = (1/(2*m))*(sum(((X*theta)-y).^2)); I got an error message. Please, how do I fix this? Thanks

What error did you get?

plotData
error: 'X' undefined near line 20 column 6
error: called from plotData at line 20 column 1

What is the solution to this?

Hi Amit, as far as I can see, I have used small x as the input argument for the plotData function, and in your error there is a capital X. Are you sure you haven't mixed up small and capital X? Please check and try once again.

I can see you have used a capital X there, not x; it is still showing the error saying not enough input arguments.

Hey Akshay, the 'y' undefined problem does exist, and not only for the code you gave; any solution from the internet gives that error. Whether run through the GUI or the command line, it says undefined. There is no clear solution for this on the net. I tried adding the path too, as suggested online, but couldn't solve the issue. I have Octave 5.1.0.

I found the solution for those who were getting the 'undefined' error: if you are using Octave, the file should not start with a function definition, because Octave then treats it as a function file, not as a script. Solution: add anything as the first line (for example, add 1; as the first line) and then start the function. If you want to test your function when you run it, first initialize the variables to matrices with the respective values, then pass them as parameters to the function.

Thanks Chethan, It will be a great help for others as well.

I didn't understand. Can you explain more clearly?

Include two lines of code: x=[]; y=[]; This should work.

It's still not working. I'm getting:

error: 'y' undefined near line 7 column 12
error: called from computeCost at line 7 column 3

Hi Akshay, I am getting an error on this line: m=lenght(y); % number of training examples. Can you help me? Thanks

This comment has been removed by a blog administrator.

Hello, within gradientDescent you use the following code: error = (X * theta) - y; theta = theta - ((alpha/m) * X'*error); What is the significance of 'error' in this? Within Ng's lectures I can't remember him making reference to 'error'.

'error' here is similar to the "cost" (J).

!! Submission failed: 'data' undefined near line 19 column 18
Function: gradientDescent
FileName: C:\Users\Gunasekar\Desktop\GNU Oct\machine-learning-ex1\ex1\gradientDescent.m
LineNumber: 19
Please correct your code and resubmit.

This is my problem; how do I correct it?

Hi, I think you are doing this assignment in Octave and that's why you are facing this issue. Chethan Bhandarkar has provided solution for it. Please check out the comment by Chethan Bhandarkar: https://www.apdaga.com/2018/06/coursera-machine-learning-week-2.html?showComment=1563986935868#c4682866656714070064 Thanks


The code that is given is not running, as it always gives the error 'y' undefined near line 7 column 12 for every function.

I did the same as Chethan said, but the issue is still not resolved; I am getting the same 'y not defined' error.

@Shilp, I think you should raise your concern on the Coursera forum.

>> gradientDescent()
error: 'y' undefined near line 7 column 12
error: called from gradientDescent at line 7 column 3
>> computeCost()
error: 'y' undefined near line 7 column 12
error: called from computeCost at line 7 column 3

I am getting this kind of error; how do I solve it?

Hey, I think the errors related to undefined variables occur because people are not passing arguments when calling the function from the Octave window. Can you post an example command to run computeCost with arguments?

The predicted prices using the normal equations and gradient descent are not equal (NE price = 293081.464335 and GD price = 289314.62034). Is that correct?

I had a similar issue. For anyone in the same situation later: please change your alpha to 1.0 and your iterations to 100.

For the computeCost.m function, I am continuously getting the error message below: Error in computeCost (line 31) J = (1/(2*m))*sum(((X*theta)-y).^2);

What error are you getting exactly?

What is the predicted value of the house? Mine comes out as $0000.00 with a huge theta value. How is that possible?

You have to modify the value of the price variable in the ex1_multi file.

OK, so for the people facing the y-is-undefined error: you can directly submit the program. It tests the ex1.m file as a whole, and it compiles successfully and gives the correct answer.


How can I directly submit the ex1.m file?

plotData
Not enough input arguments.
Error in plotData (line 19)
plot(x, y, 'rx', 'MarkerSize', 10); % Plot the data

I got this error. How can I solve it?

Try:

```matlab
ylabel('Profit in $10,000s');           % Set the y-axis label
xlabel('Population of City in 10,000s'); % Set the x-axis label
plot(x, y, 'rx', 'MarkerSize', 10);      % Plot the data
```

Not enough input arguments.
Error in computeCost (line 7)
m = length(y); % number of training examples

I got the same error. Have you found a solution yet?

Hi, I am getting the same error and the program doesn't give the solution. Please advise.

I am having problems with nearly every one of these solutions. I am 12 and learning machine learning for the first time, and I am having trouble following this, as I find these solutions do not work. Any help?

Hello, I am stuck in Week 2 plotData. I keep getting errors like:

>> Qt terminal communication error: select() error 9 Bad file descriptor

or:

error: /Users/a69561/Desktop/machine-learning-ex1/ex1/plotData.m at line 19, column 3

Can somebody help me?

Thank you for the solution, but I am still getting two different values for the price of the house (with the normal equation and the gradient descent method).

Hi, I have the same 'undefined' problem. Please help me. I am using Octave. Is there any other way to submit the programming assignment? Please help.

What are your learning rate alpha and number of iterations?

I have provided only the function definitions here. You can find the parameter values (alpha, number of iterations) in the execution section of your assignment.

In linear regression with multiple variables, the price of the house from the first method (gradient descent) was different from the second method (normal equations). I am still not able to match the values from the two methods. Note: I copied all the code as per your guidance.

Hi, thanks for all your help. But I have a problem with submission. When I finished all the work, I tried to submit it all at once and got this:
>> submit
Warning: Name is nonexistent or not a directory: /MATLAB Drive/./lib
  > In path (line 109)
    In addpath (line 86)
    In addpath (line 47)
    In submit (line 2)
Warning: Name is nonexistent or not a directory: /MATLAB Drive/./lib/jsonlab
  > In path (line 109)
    In addpath (line 86)
    In addpath (line 47)
    In submitWithConfiguration (line 2)
    In submit (line 45)
'parts' requires one of the following: Automated Driving Toolbox, Navigation Toolbox, Robotics System Toolbox, Sensor Fusion and Tracking Toolbox
Error in submitWithConfiguration (line 4)
  parts = parts(conf);
Error in submit (line 45)
  submitWithConfiguration(conf);
>> submitWithConfiguration
Warning: Name is nonexistent or not a directory: /MATLAB Drive/./lib/jsonlab
  > In path (line 109)
    In addpath (line 86)
    In addpath (line 47)
    In submitWithConfiguration (line 2)
'parts' requires one of the following: Automated Driving Toolbox, Navigation Toolbox, Robotics System Toolbox, Sensor Fusion and Tracking Toolbox
Error in submitWithConfiguration (line 4)
  parts = parts(conf);

Check that you are in the ex1 folder, and to submit the solution use submit(), not submit; add the parentheses.

This is happening because the variable parts has the same name as the parts(conf) function in the file ex1/lib/submitWithConfiguration.m. Make the following changes to resolve this:
Line 4: parts_1 = parts(conf);
Line 92: function [parts_1] = parts(conf)
Line 93: parts_1 = {};
Line 98: parts_1{end + 1} = part;
Basically, I've just renamed the variables. The same thing happens with one more variable, so also make the following changes:
Line 66: submissionUrl_1 = submissionUrl();
Line 68: responseBody = getResponse(submissionUrl_1, body);
Line 22: response = submitParts(conf, email, token, parts_1);
Line 37: showFeedback(parts_1, response);
This worked for me.

After changing my variable names as well, I'm getting an error when calling the function parts:
!! Submission failed: Not enough input arguments.
Function: parts
FileName: C:\Users\Avanthi\Documents\ML\exp-2\lib\submitWithConfiguration.m
LineNumber: 94
Can someone help me with this?

Hello Akshay, in computeCost, how do I declare or compute 'theta'? It's giving an error: 'theta' undefined.

error: structure has no member 'message'
error: called from
  submitWithConfiguration at line 35 column 5
  submit at line 45 column 3
error: evaluating argument list element number 2
error: called from
  submitWithConfiguration at line 35 column 5
  submit at line 45 column 3
How do I solve this?

Hello Akshay Daga (APDaga), very glad to come across your guide on ML by Andrew Ng. I have been stuck for months and could not complete the programming assignment. I have done up to computeCost but got stuck at gradientDescent. Below is the error. I don't want to drop out of this course, so please help me out.
"error: 'num_iters' undefined near line 1 column 58"
Here is my update:
h = (theta(1) + theta(2)*X)';
theta(1) = theta(1) - alpha * (1/m) * theta(1) + theta(2)*X' * X(:, 1);
theta(2) = theta(2) - alpha * (1/m) * htheta(1) + theta(2)*X' * X(:, 2);
I count on your assistance.

gradientDescent()
error: 'y' undefined near line 7 column 14
error: evaluating argument list element number 1
error: called from:
error: /Users/apple/Downloads/machine-learning-ex1/ex1/gradientDescent.m at line 7, column 5
I am getting this error for both gradientDescent and computeCost. Please help me out.

function [theta, J_history] = gradientDescent(X, y, theta, alpha, iterations)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
h = X*theta;
error = (h - y);
theta_c = (alpha/m) * (sum((error)*X'));
theta = theta - theta_c;
J_history = zeros(num_iters, 1);
for iter = 1:iterations
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter
    %               vector theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    % ============================================================
    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end
end

While running in Octave it shows:
Running Gradient Descent ...
error: gradientDescent: operator *: nonconformant arguments (op1 is 97x1, op2 is 2x97)
error: called from gradientDescent at line 10 column 8
ex1 at line 77 column 7
Where is the problem?
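
An editorial observation on the code above, not an original reply: the nonconformant-arguments error comes from (error)*X', which multiplies a 97x1 vector by a 2x97 matrix; the update also sits outside the loop, and the signature names the argument iterations while the body uses num_iters. A minimal corrected sketch, using the same variable names and the vectorized update quoted later in this thread:

for iter = 1:num_iters
    error = (X*theta) - y;                       % 97x1 residual vector
    theta = theta - (alpha/m) * (X' * error);    % X' is 2x97, so X'*error is 2x1
    J_history(iter) = computeCost(X, y, theta);  % cost should fall each pass
end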

I got an error in computeCost.m as follows: max_recursion_depth reached. How do I solve this?

I got an error:
error: computeCost: operator /: nonconformant arguments (op1 is 1x1, op2 is 1x2)
How do I solve this?

I can't see any variables named op1 or op2 used in the code. Please check once again where you got those variables from.

Hi, great guidance. Only, I am still confused about how the single-parameter cost function code and the multi-parameter cost function code are the same (the same confusion applies to gradientDescent, single and multi). Am I missing something?

The single-parameter cost function is as follows:
h = X*theta;
temp = 0;
for i = 1:m
    temp = temp + (h(i) - y(i))^2;
end
J = (1/(2*m)) * temp;
This doesn't work for the multi-parameter cost function. But I have also provided a vectorized implementation (it is generic code and works for both single and multiple parameters).
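
For reference, the vectorized implementation mentioned here (it is quoted verbatim later in this thread) computes the same cost for any number of parameters, since X*theta multiplies through all features at once:

J = (1/(2*m)) * sum(((X*theta) - y).^2);   % one line, works for any number of features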

Hello, I am getting 'x is undefined' while submitting plotData in assignment 2. I checked several times but I am getting the same error. Will you please help me?

function plotData(x, y)
figure;
plot(x, y, 'rx', 'MarkerSize', 10);
ylabel('Profit in $10,000s');
xlabel('Population of City in 10,000s');
I always get 'x is undefined'. I can't understand where the error is. Please help.

While doing it in MATLAB, it says there is an error in submitWithConfiguration in the submit.m file. That file was actually given by default, so why does it show an error there?

Still the same problem with undefined y (lowercase) using Octave 5.2.0. Adding anything as the first line didn't help. What else could I do? Does somebody have a working version? I am stuck at this point.

Instead of running the functions individually, run 'ex1' after completing all the problems; then it will not show any error.

Hi, I am using the MATLAB R2015a version offline and getting an error in submitWithConfiguration (line 158). How do I rectify this error?

Raise this concern in the Coursera forum.

If you implement featureNormalize that way, it gives a dimension-disagreement error, so I suggest doing it the following way:
mu = ones(size(X,1),1) * mean(X);
sigma = ones(size(X,1),1) * std(X);
X_norm = (X - mu) ./ sigma;
P.S.: it gives me accurate results.

I entered submit(), but I am getting an error, so please help me with how to submit my assignment.

I think you should raise this concern on the Coursera forum.

Try just submit, without the brackets.

Your code is not working when I use it.

Sorry to hear that. But it was working 100% for me and for some others as well.

I get a 'num_iters' not defined error. Please help.

Just got the answer for 'num_iters' not defined: you have to fix line 59 in submit.m.

I have a problem running the line of code below:
(X * theta) - y;
It gives:
error: operator *: nonconformant arguments (op1 is 97x1, op2 is 2x1)
I can understand why, because a 97x1 matrix X cannot be multiplied by a 2x1 matrix. Any ideas?

I get the error below when executing ex1 to test the gradientDescent function:
error: computeCost: operator *: nonconformant arguments (op1 is 97x2, op2 is 194x1)
error: called from computeCost at line 15 column 2
gradientDescent at line 36 column 21
ex1 at line 77 column 7
My gradientDescent function has the lines below, as per the tutorial:
temp0 = theta(1) - ((alpha/m) * sum((X * theta) - y) .* X(:,1));
temp1 = theta(2) - ((alpha/m) * sum((X * theta) - y) .* X(:,2));
theta = [temp0; temp1];
My computeCost function has this line of code on line 15:
J = 1/(2*m)*sum(((X*theta)-y).^2)
NB: surprisingly, I can run the gradientDescent lines individually at the Octave prompt without problems.

I also had this problem; I realised that it is to do with the brackets. If you compare your code to mine:
t0 = theta(1) - ((alpha/m) * sum(((X * theta) - y) .* X(:,1)));
t1 = theta(2) - ((alpha/m) * sum(((X * theta) - y) .* X(:,2)));
theta = [t0; t1];
you can see that you are missing two brackets on each side; the dimensions may be getting messed up due to the wrong grouping of operations.


Hey, how do you calculate the value of theta?

The values of theta(1) and theta(2) are initially set to 0: theta = zeros(2, 1).

Getting an error, theta is undefined...

I get the error below when executing ex1 to submit gradient descent:
>> submit
'parts' requires one of the following: Automated Driving Toolbox, Navigation Toolbox, Robotics System Toolbox, Sensor Fusion and Tracking Toolbox
Error in submitWithConfiguration (line 4)
  parts = parts(conf);
Error in submit (line 45)
  submitWithConfiguration(conf);

Did you get an answer for this? I see this error too.

I have the same error

Some of these answers are incorrect. For example, the feature normalization answer is wrong: when you calculate (X - mu) / sigma, X and mu have different dimensions, so it doesn't work.

Thanks for the feedback. All these answers worked 100% for me, and they are working fine for many others as well (you can get an idea from the comments). But Coursera keeps updating the assignments from time to time, so you might be right in that case. Please use the code above just for reference and then solve your assignment on your own. Thanks.

Hello brother, can you please briefly explain the working of these two lines of gradient descent?
error = (X * theta) - y;
theta = theta - ((alpha/m) * X'*error);

How can I solve this problem?
>> submit
'parts' requires one of the following: Automated Driving Toolbox, Navigation Toolbox, Robotics System Toolbox, Sensor Fusion and Tracking Toolbox
Error in submitWithConfiguration (line 4)
  parts = parts(conf);
Error in submit (line 45)
  submitWithConfiguration(conf);

same problem here

Hi, when I run my code, the predicted price of the house (in ex1_multi.m) comes out as 0.0000. How can I fix that?

>> [Xn mu sigma] = featureNormalize([1 ; 2 ; 3])
error: Invalid call to std. Correct usage is:
-- std (X)
-- std (X, OPT)
-- std (X, OPT, DIM)
error: called from
  print_usage at line 91 column 5
  std at line 69 column 5
  featureNormalize at line 32 column 8
Even though I am doing it the right way, I hope:
mu = mean(X);
sigma = std(X, 1);
X_norm = (X - mu) ./ std;
Anyone have any idea why I am facing this error?

I also tried simply: sigma = std(X);
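
An editorial observation on the snippet above, not an original reply: the likely cause is that the last line divides by std, the function itself, rather than by the computed sigma. Octave then calls std with no arguments, which matches the 'Invalid call to std' / print_usage error in the trace. A minimal corrected sketch (assuming an Octave version with automatic broadcasting):

mu = mean(X);                 % 1 x n row vector of column means
sigma = std(X);               % 1 x n row vector of column standard deviations
X_norm = (X - mu) ./ sigma;   % divide by the computed sigma, not the std function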

>> submit()
'parts' requires one of the following: Automated Driving Toolbox, Navigation Toolbox, Robotics System Toolbox, Sensor Fusion and Tracking Toolbox
Error in submitWithConfiguration (line 4)
  parts = parts(conf);
Error in submit (line 45)
  submitWithConfiguration(conf);

This is happening because the variable parts has the same name as the parts(conf) function in ex1/lib/submitWithConfiguration.m. Apply the same renaming fix described earlier in this thread (rename parts to parts_1, and submissionUrl to submissionUrl_1). This worked for me.

Which is better for submitting assignments, Octave or MATLAB? And are the solutions that have been provided for MATLAB or Octave?

I have provided the solutions in MATLAB, but they work in Octave as well.

Hi, I don't understand why X*theta. I mean, theta is a 2x1 vector, right? I understand the formula, but I get confused in this exercise.

I figured it out; I had thought X was a 97x1 vector. I have another question: is this gradient descent with one variable? I thought it was two variables. Does theta0 count as a variable?

%%%%%%%% CORRECT %%%%%%%%%%
error = (X * theta) - y;
theta = theta - ((alpha/m) * X'*error);
%%%%%%%%%%%%%%%%%%%%%%%%%%%
Why is sum not used here? Thanks!!!

Here we have used matrix multiplication (which is nothing but a sum-of-products operation). Matrix multiplication already includes the sum operation.

Ohhh! So the other one is a (dot) product. Thank you so much! You are awesome!

J = (1/(2*m))*sum(((X*theta)-y).^2);
Can you please break this down? Why did we use sum here? Thanks in advance!!

...and not in the one above (theta = theta - ((alpha/m) * X'*error))? I could see from the dimensions that sum is not required there, but I want to know how I should think about it (the intuition), or how to approach deciding whether I do or do not need sum.

"Matrix multiplication (Which is nothing but Sum of product operation)." then why using SUM here, J = (1/(2*m))*sum(((X*theta)-y).^2);

Please, please help. I will be ever grateful to you, and will pray for you.

Don't get confused between the normal and vectorized implementations.
> "sum" in the vectorized implementation represents the summation in the given formula.
> In the normal implementation, "temp = temp + formula" is equivalent to the "sum" in the vectorized implementation.
Please look at the code below (both versions achieve the same result), compare them, and try to understand.

%%%%%%%%%%%%% CORRECT %%%%%%%%%
h = X*theta;
temp = 0;
for i = 1:m
    temp = temp + (h(i) - y(i))^2;
end
J = (1/(2*m)) * temp;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%% CORRECT: Vectorized implementation %%%%%%%%%
J = (1/(2*m))*sum(((X*theta)-y).^2);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
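
To make the sum-inside-matrix-multiplication point concrete, here is a small editorial sketch (not from the original thread) on a made-up toy dataset. It evaluates both cost implementations above and shows that X'*error already contains the per-parameter sums, which is why the gradient update needs no explicit sum():

% toy data: 3 examples, intercept column plus one feature
X = [1 1; 1 2; 1 3];
y = [2; 3; 4];
theta = [0.5; 0.5];
m = length(y);

% loop version of the cost
h = X*theta;
temp = 0;
for i = 1:m
    temp = temp + (h(i) - y(i))^2;
end
J_loop = (1/(2*m)) * temp;

% vectorized version: sum() performs the same summation explicitly
J_vec = (1/(2*m)) * sum(((X*theta) - y).^2);

% gradient: the matrix product X'*error sums over examples implicitly
error = (X*theta) - y;          % 3x1 residuals
grad = (1/m) * (X'*error);      % 2x1: each entry is a sum over the 3 examples

disp([J_loop, J_vec])           % both print the same value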

My goodness! Thank you so much! You are awesome! You have explained it very nicely. I became your fan. God bless you! I will be following your blog.

x = data(:, 1);
y = data(:, 2);
m = length(y);
plot(x, y, 'rx', 'MarkerSize', 10);
ylabel('Profit in $10,000s');
xlabel('Population of City in 10,000s');
X = [ones(m, 1), data(:,1)];
theta = zeros(2, 1);
iterations = 1500;
alpha = 0.01;
temp = 0;
for i = 1:m
    temp = temp + (h(i) - y(i))^2;
end
J = (1/(2*m)) * temp;
>> J
J = 32.073
The answer is good, but when I execute submit:
!! Submission failed: operator *: nonconformant arguments (op1 is 97x1, op2 is 2x1)
Function: computeCost
FileName:
LineNumber: 65
Help me please.

Why is it showing "This item will be unlocked when the session begins." on the quiz section?

I managed to run everything else correctly in Octave but got a submission error. Please help.
!! Submission failed: parse error near line 25 of file C:\Users\user\Desktop\ml-class-ex1\computeCostMulti.m
syntax error
>>> j= (1/(2*m)) *sum(((X*theta)-y.^2);
                                      ^
Function: submit>output
FileName: C:\Users\user\Desktop\ml-class-ex1\submit.m
LineNumber: 63
Please correct your code and resubmit.

What can I do to fix this problem? Please help me.
> submit
Unrecognized function or variable 'parts'.
Error in submitWithConfiguration (line 4)
  parts = parts(conf);
Error in submit (line 45)
  submitWithConfiguration(conf);

I have some issues while uploading the code. It runs without any error, but at the end the score still shows 0/10 for the third question and so on. The same result is also reflected on my course ID. Please help.

It should not happen; you might be missing something simple in your process. Have you raised this concern on the Coursera forum? Please try it there; you will get a resolution for sure.

I have an error at m = length(y). This error keeps occurring.

Thank you, Akshay, for helping lots of people for years!

Thank you for your kind words.

Hi Akshay, I have a question about gradient descent with multiple variables. While doing gradient descent with one variable, we used an X of the form [1 x; 1 x; 1 x], i.e., we put a 1 in each row for theta(0). My question is: for multi-variable gradient descent, do we also use an X matrix of the form [1 x1 x2 ...]? In Coursera's example they took:
X = data(:, 1:2);
y = data(:, 3);
Don't they need to add the 1s in X to represent theta(0)?

Once you split the input (X) and output (y) from the raw data, the line below adds the 1s to the input (X), as mentioned in the theory:
X = [ones(m, 1), data(:,1)];
That line takes care of adding the ones to the input (X). Please check the code; it is already present there.
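
For the multi-variable case the question asks about, the analogous step would look like the following sketch (an editorial addition, assuming the standard ex1data2.txt file used by ex1_multi):

data = load('ex1data2.txt');   % columns: size, bedrooms, price
X = data(:, 1:2);              % two input features
y = data(:, 3);                % target
m = length(y);

% after feature normalization, prepend the column of ones for theta(0)
X = [ones(m, 1), X];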

I have an issue:
>> submitWithConfiguration
error: 'conf' undefined near line 4, column 4
error: called from submitWithConfiguration at line 4 column 10

I am facing this error while submitting my assignment: unexpected error: Index in position 1 exceeds array bounds. I need help; how can I fix it?

I copied exactly the same code as the author. The program ran successfully, but the result of gradient descent was crazy large (incorrect, much bigger than expected). I was stuck on this part for a long time. Could anyone help? Thank you very much. My Octave version is 6.3.0. Here is my output:
Loading data ...
First 10 examples from the dataset:
x = [2104 3], y = 399900
x = [1600 3], y = 329900
x = [2400 3], y = 369000
x = [1416 2], y = 232000
x = [3000 4], y = 539900
x = [1985 4], y = 299900
x = [1534 3], y = 314900
x = [1427 3], y = 198999
x = [1380 3], y = 212000
x = [1494 3], y = 242500
Program paused. Press enter to continue.
Normalizing Features ...
Running gradient descent ...
Theta computed from gradient descent:
340412.659574
110631.050279
-6649.474271
Predicted price of a 1650 sq-ft, 3 br house (using gradient descent): $182861697.196858
Program paused. Press enter to continue.
Solving with normal equations...
Theta computed from the normal equations:
89597.909543
139.210674
-8738.019112
Predicted price of a 1650 sq-ft, 3 br house (using normal equations): $293081.464335

Facing the same issue. Any solution to this?

!! Submission failed: unexpected error: Undefined function 'makeValidFieldName' for input arguments of type 'char'. !! Please try again later.

Facing the same issue. Any updates?

Concerning the code for gradient descent: I am yet to understand how the iterations work. Am I supposed to keep running gradient descent and manually updating theta myself until I get the value of theta with the lowest cost? Please expand on this; it would be very helpful.
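
An editorial note on the question above, not an original reply: no manual re-running is needed. The for loop inside gradientDescent repeats the update num_iters times, and each pass overwrites theta with a slightly better value. A minimal sketch of the structure:

for iter = 1:num_iters
    % one gradient step: theta moves downhill on the cost surface
    theta = theta - (alpha/m) * (X' * ((X*theta) - y));
    % record the cost so you can verify it decreases every iteration
    J_history(iter) = computeCost(X, y, theta);
end
% after the loop finishes, theta holds the final learned parameters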

>> normalEqn
error: 'X' undefined near line 7, column 22
error: called from normalEqn at line 7 column 9
I am getting this error in normalEqn.



