King County Home Sales

Authors: Jerry Vasquez, Paul Lindquist, Vu Brown

Overview

This project assumes that we work for a real estate company in King County, Washington. We're tasked with providing data-driven recommendations for buying and selling homes.

Business Problem

The company wants recommendations for customers looking to buy a home and for customers looking to sell their home.

As such, recommendations will be divided into 2 categories: buying homes and selling homes.

Buying homes:

- Use housing data (comps) of recently sold homes to create an inferential model
  - The model will be built using linear regression and selected variables (features) that best work with the target (price)
- Use the inferential model to assess which houses are underpriced or overpriced for customers looking to buy a home
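
As a rough illustration of the buying workflow above, the sketch below fits a one-feature least-squares line to a handful of made-up comps and flags listings whose asking price differs from the fitted price by more than a tolerance. The data, the single `sqft` feature, the 10% tolerance and the helper names `fit_line` and `assess_listing` are all assumptions for illustration; the project's actual model uses several features and lives in the notebook.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def assess_listing(sqft, asking_price, slope, intercept, tolerance=0.10):
    """Compare an asking price with the fitted price for a home of this size."""
    fitted = slope * sqft + intercept
    gap = (asking_price - fitted) / fitted  # relative difference from the fit
    if gap > tolerance:
        return "overpriced"
    if gap < -tolerance:
        return "underpriced"
    return "fair"


# Hypothetical comps (square feet, sale price); not taken from the dataset.
comp_sqft = [1000, 1500, 2000, 2500, 3000]
comp_price = [300_000, 400_000, 500_000, 600_000, 700_000]
slope, intercept = fit_line(comp_sqft, comp_price)

# Hypothetical listings to assess against the comps.
for sqft, asking in [(1800, 520_000), (2200, 480_000), (2000, 510_000)]:
    print(sqft, asking, assess_listing(sqft, asking, slope, intercept))
```

The tolerance is a judgment call: a wider band flags fewer homes, which may be appropriate when the model's error is large.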

Selling homes:

- Build a predictive model using comps of recently sold homes
  - The model will also be built using linear regression but take a more holistic approach, including features that could add value
- Use the predictive model to recommend what price should be assigned to homes that will be sold by the company

Data

This project uses the King County House Sales dataset, which can be found here.

Methods

We'll conduct descriptive analysis, linear regression and modeling (inferential, predictive).

Results

These are home prices across King County that fall within 3 standard deviations of the mean in our dataset. The more expensive houses tend to have a view or sit on the waterfront. We can also see a cluster of higher-priced homes centered around western Bellevue and Mercer Island.
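
A minimal sketch of a 3-standard-deviation filter like the one mentioned above, assuming it is applied to sale price around the mean. The function name `within_n_std` and the sample prices are hypothetical, and the notebook may implement the cutoff differently.

```python
from statistics import mean, stdev


def within_n_std(prices, n=3.0):
    """Keep only prices within n sample standard deviations of the mean."""
    mu = mean(prices)
    sigma = stdev(prices)
    lower, upper = mu - n * sigma, mu + n * sigma
    return [p for p in prices if lower <= p <= upper]


# Hypothetical prices: 20 typical sales plus one extreme outlier.
prices = [350_000 + 5_000 * i for i in range(20)] + [8_000_000]
kept = within_n_std(prices)
print(len(prices), "->", len(kept))
```

Note that a single extreme sale inflates the standard deviation itself, so in small samples this kind of filter can fail to remove anything.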

These are the coefficients of the features we used in our final inferential model. The negative values for bedrooms, bathrooms and lot size are indicative of the challenges our models encountered with the dataset.

Running our predictive model built on 12 home features, we're able to predict what price point should be used when selling a customer's house. Given the high price variance, it's not recommended to solely use our predictive model in determining price.
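
One hedged way to act on that caveat is to quote a range rather than a single price, using the model's RMSE as a rough error band. The sketch below computes RMSE from hypothetical actual and predicted prices, then widens a hypothetical prediction by the $193k figure reported in the Conclusions. The helper names are assumptions, and a band of plus or minus one RMSE is a heuristic, not a formal prediction interval.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def price_band(predicted_price, error, floor=0.0):
    """Return (low, high) around a predicted price, never below the floor."""
    return max(floor, predicted_price - error), predicted_price + error


# Hypothetical hold-out prices and model predictions.
actual = [500_000, 650_000, 420_000, 900_000]
predicted = [530_000, 610_000, 450_000, 860_000]
print(round(rmse(actual, predicted)))

# Hypothetical $600k prediction with the roughly $193k RMSE from the Conclusions.
low, high = price_band(600_000, 193_000)
print(low, high)
```

A band this wide shows why the predicted price works better as one input alongside an agent's judgment than as a final listing price.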

Conclusions

Given the provided dataset and linear regression approach, neither our inferential nor our predictive model performed as well as we had hoped.

For the inferential model, homoscedasticity was poor, multicollinearity (VIF) was too high and the RMSE was north of $173k. Linearity and normality performed decently.
For the predictive model, multicollinearity (VIF) was too high for several features, the model score was an underwhelming 0.601629 and the RMSE of more than $193k was too great to provide any predictive value. Linearity and homoscedasticity were decent and normality was excellent, due to normalizing the data in pre-processing.
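
For readers unfamiliar with VIF, the sketch below shows the calculation in its simplest case. The variance inflation factor of a feature is 1 / (1 - R²), where R² comes from regressing that feature on the other features. With only two features, R² is the squared Pearson correlation between them. The feature values and function names are hypothetical, and the notebook's VIF calculation across many features may use a library routine instead.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def vif_two_features(xs, ys):
    """VIF when the model has exactly two features: 1 / (1 - r^2)."""
    r_squared = pearson_r(xs, ys) ** 2
    if r_squared >= 1.0:
        return float("inf")  # perfectly collinear features
    return 1.0 / (1.0 - r_squared)


# Hypothetical feature values for five homes.
sqft_living = [1000, 2000, 3000, 4000, 5000]
bedrooms = [2, 1, 4, 3, 5]
vif = vif_two_features(sqft_living, bedrooms)
print(round(vif, 2))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, though the appropriate threshold depends on the use case.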

We recommend not using our models for predictive purposes and instead exploring a different dataset and/or modeling approaches other than linear regression.

For More Information

Please review our full analysis in our Jupyter Notebook or presentation deck.

For additional questions, please contact Jerry, Paul or Vu.

Repository Structure

├── README.md                           <- The top-level README for reviewers of this project
├── main_notebook.ipynb                 <- Narrative documentation of analysis in Jupyter notebook
├── project_presentation.pdf            <- PDF version of project presentation
├── data                                <- Both sourced externally and generated from code
└── images                              <- Both sourced externally and generated from code
