Sign Up

Have an account? Sign In Now

Sign In

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

Sorry, you do not have a permission to ask a question, You must login to ask question.

Forgot Password?

Need An Account, Sign Up Here
Sign InSign Up

ErrorCorner

ErrorCorner Logo ErrorCorner Logo

ErrorCorner Navigation

  • Home
  • Contact Us
  • About Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Contact Us
  • About Us
Home/ Questions/Q 315
Next
Answered
Kenil Vasani
Kenil Vasani

Kenil Vasani

  • 646 Questions
  • 567 Answers
  • 77 Best Answers
  • 26 Points
View Profile
  • 6
Kenil Vasani
Asked: December 10, 20202020-12-10T22:48:37+00:00 2020-12-10T22:48:37+00:00In: Python

Trying to merge 2 dataframes but get ValueError

  • 6

These are my two dataframes saved in two variables:

> print(df.head())
>
          club_name  tr_jan  tr_dec  year
    0  ADO Den Haag    1368    1422  2010
    1  ADO Den Haag    1455    1477  2011
    2  ADO Den Haag    1461    1443  2012
    3  ADO Den Haag    1437    1383  2013
    4  ADO Den Haag    1386    1422  2014
> print(rankingdf.head())
>
           club_name  ranking  year
    0    ADO Den Haag    12    2010
    1    ADO Den Haag    13    2011
    2    ADO Den Haag    11    2012
    3    ADO Den Haag    14    2013
    4    ADO Den Haag    17    2014

I’m trying to merge these two using this code:

new_df = df.merge(ranking_df, on=['club_name', 'year'], how='left')

The how=’left’ is added because I have less datapoints in my ranking_df than in my standard df.

The expected behaviour is as such:

> print(new_df.head()) 
> 

      club_name  tr_jan  tr_dec  year    ranking
0  ADO Den Haag    1368    1422  2010    12
1  ADO Den Haag    1455    1477  2011    13
2  ADO Den Haag    1461    1443  2012    11
3  ADO Den Haag    1437    1383  2013    14
4  ADO Den Haag    1386    1422  2014    17

But I get this error:

ValueError: You are trying to merge on object and int64 columns. If
you wish to proceed you should use pd.concat

But I do not wish to use concat since I want to merge the trees not just add them on.

Another behaviour that’s weird in my mind is that my code works if I save the first df to .csv and then load that .csv into a dataframe.

The code for that:

df = pd.DataFrame(data_points, columns=['club_name', 'tr_jan', 'tr_dec', 'year'])
df.to_csv('preliminary.csv')

df = pd.read_csv('preliminary.csv', index_col=0)

ranking_df = pd.DataFrame(rankings, columns=['club_name', 'ranking', 'year'])

new_df = df.merge(ranking_df, on=['club_name', 'year'], how='left')

I think that it has to do with the index_col=0 parameter. But I have no idea to fix it without having to save it, it doesn’t matter much but is kind of an annoyance that I have to do that.

dataframepandaspython
  • 1 1 Answer
  • 10 Views
  • 0 Followers
  • 0
Answer
Share
  • Facebook

    1 Answer

    • Voted
    1. Kenil Vasani

      Kenil Vasani

      • 646 Questions
      • 567 Answers
      • 77 Best Answers
      • 26 Points
      View Profile
      Best Answer
      Kenil Vasani
      2020-12-10T22:47:00+00:00Added an answer on December 10, 2020 at 10:47 pm

      In one of your dataframes the year is a string and the other it is an int64
      you can convert it first and then join (e.g. df['year']=df['year'].astype(int) or as RafaelC suggested df.year.astype(int))

      Edit: Also note the comment by Anderson Zhu: Just in case you have None or missing values in one of your dataframes, you need to use Int64 instead of int. See the reference here.

      • 7
      • Share
        Share
        • Share on Facebook
        • Share on Twitter
        • Share on LinkedIn
        • Share on WhatsApp

    You must login to add an answer.

    Forgot Password?

    Sidebar

    Ask A Question
    • Popular
    • Kenil Vasani

      SyntaxError: invalid syntax to repo init in the AOSP code

      • 5 Answers
    • Kenil Vasani

      runtimeError: package fails to pass a sanity check for numpy ...

      • 3 Answers
    • Kenil Vasani

      xlrd.biffh.XLRDError: Excel xlsx file; not supported

      • 3 Answers
    • Kenil Vasani

      Homebrew fails on MacOS Big Sur

      • 3 Answers
    • Kenil Vasani

      Error: PostCSS plugin tailwindcss requires PostCSS 8

      • 2 Answers

    Explore

    • Most Answered
    • Most Visited
    • Most Voted
    • Random

    © 2020-2021 ErrorCorner. All Rights Reserved
    by ErrorCorner.com