Missing Data #535

Open
JeremyWhittaker opened this issue Oct 13, 2023 · 3 comments
Labels: bug, data

JeremyWhittaker commented Oct 13, 2023

I'm pulling financial data for the symbol LEN. My data frame is below, along with a chart of revenues. Chunks of data are missing. For small companies this may be normal, since they may not have released updated quarterly financials. But this company did release the data; I verified it against Yahoo Finance (screenshot also below). Are there going to be a lot of companies where your data is inaccurate like this? It makes analyzing aggregate data incredibly difficult.

The Python function I used to fetch this data is here:

import logging
import time

from polygon import RESTClient
# PolygonAPIError is not imported in the original snippet; it is assumed to be defined elsewhere.


def fetch_fundamental_data(api_key, stock_ticker, filing_date_gte, license_type="free"):
    """
    Fetch financial statement data for a given stock ticker using Polygon API.

    Parameters:
        api_key (str): Your Polygon API key.
        stock_ticker (str): Stock ticker symbol.
        filing_date_gte (str): Start date for the filing date filter.
        license_type (str): Type of license ("free" or "paid")

    Returns:
        list or None: The financial statement records, or None if nothing was returned.
    """
    # Create a REST client and authenticate with the API key
    client = RESTClient(api_key)

    while True:
        # Start from an empty list on every attempt so a retry does not duplicate records
        data = []
        request_count = 0
        try:
            # Fetch the financial statement data and add it to the list
            for t in client.vx.list_stock_financials(ticker=stock_ticker, filing_date_gte=filing_date_gte, limit=100):
                logging.info(f'{stock_ticker}: Got data from API {t}')
                data.append(t)

                # Note: this counts records yielded, not HTTP requests
                # (a single request can return up to `limit` records)
                request_count += 1

                # Log the filing date of the record just fetched
                logging.info(f"Latest filing date fetched: {t.filing_date}")

                if license_type == "free" and request_count >= 5:  # Check if the rate limit is reached
                    time.sleep(60)  # Pause for 60 seconds
                    request_count = 0  # Reset the request count

            break  # Exit the while loop if successful
        except PolygonAPIError as e:
            if "maximum requests per minute" in str(e):
                time.sleep(60)  # Pause for 60 seconds, then retry from scratch
            else:
                raise  # Re-raise the exception if it's not a rate-limit error

    # Check for an empty result only after the iterator is exhausted
    if not data:
        logging.warning(f"No data found for stock ticker {stock_ticker}. It may be an invalid symbol.")
        return None

    return data
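
The pacing logic in the function above (pause once five calls have been made within a minute) could also be factored into a small helper. The following is only a sketch under assumptions: `MinuteRateLimiter` and its parameters are hypothetical names, not part of the Polygon client, and the five-calls-per-minute figure is taken from the snippet rather than from any documented limit.

```python
import time


class MinuteRateLimiter:
    """Allow at most `max_calls` calls per `period` seconds (hypothetical helper)."""

    def __init__(self, max_calls=5, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._stamps = []      # times of the calls still inside the sliding window

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop calls that have left the sliding window
            self._stamps = [t for t in self._stamps if now - t < self.period]
            if len(self._stamps) < self.max_calls:
                break
            # Sleep until the oldest recorded call leaves the window
            self._sleep(self.period - (now - self._stamps[0]))
        self._stamps.append(now)
```

Calling `limiter.wait()` before each request would replace the manual counter. This only helps where each HTTP request is issued explicitly; when a client iterator paginates internally, the requests it makes are not visible to the caller.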

image

image

           cik       company_name    end_date  filing_date  fiscal_period  \
0   0000920760  LENNAR CORP /NEW/  2023-08-31   2023-09-29             Q3
1   0000920760  LENNAR CORP /NEW/  2023-05-31   2023-06-30             Q2
2   0000920760  LENNAR CORP /NEW/  2023-02-28   2023-04-04             Q1
3   0000920760  LENNAR CORP /NEW/  2022-11-30   2023-01-26             FY
4   0000920760  LENNAR CORP /NEW/  2022-08-31   2022-10-04             Q3
5   0000920760  LENNAR CORP /NEW/  2022-05-31   2022-07-01             Q2
6   0000920760  LENNAR CORP /NEW/  2022-02-28   2022-04-01             Q1
7   0000920760  LENNAR CORP /NEW/  2021-11-30   2022-01-28             FY
8   0000920760  LENNAR CORP /NEW/  2021-08-31   2021-10-01             Q3
9   0000920760  LENNAR CORP /NEW/  2021-05-31   2021-07-02             Q2
10  0000920760  LENNAR CORP /NEW/  2021-02-28   2021-04-01             Q1
11  0000920760  LENNAR CORP /NEW/  2020-11-30   2021-01-22             FY
12  0000920760  LENNAR CORP /NEW/  2020-08-31   2020-10-01             Q3
13  0000920760  LENNAR CORP /NEW/  2020-05-31   2020-07-06             Q2
14  0000920760  LENNAR CORP /NEW/  2020-02-29   2020-04-07             Q1

    fiscal_year  current_liabilities  equity_attributable_to_parent
0          2023         1.164958e+10                   2.565662e+10
1          2023        -2.516112e+10                   2.501514e+10
2          2023        -2.455529e+10                   2.441826e+10
3          2022         1.374393e+10                   2.410050e+10
4          2022         1.221234e+10                   2.297728e+10
5          2022        -2.178977e+10                   2.159826e+10
6          2022        -2.084743e+10                   2.067906e+10
7          2021         1.221150e+10                   2.081642e+10
8          2021         1.196465e+10                   2.065019e+10
9          2021        -1.970210e+10                   1.957611e+10
10         2021        -1.901745e+10                   1.889625e+10
11         2020         1.183578e+10                   1.799486e+10
12         2020         1.203494e+10                   1.717210e+10
13         2020        -1.663262e+10                   1.654270e+10
14         2020        -1.619338e+10                   1.604460e+10

    noncurrent_assets  noncurrent_liabilities  ...  \
0                   0                       0  ...
1                   0                       0  ...
2                   0                       0  ...
3                   0                       0  ...
4                   0                       0  ...
5                   0                       0  ...
6                   0                       0  ...
7                   0                       0  ...
8                   0                       0  ...
9                   0                       0  ...
10                  0                       0  ...
11                  0                       0  ...
12                  0                       0  ...
13                  0                       0  ...
14                  0                       0  ...

    net_cash_flow_from_financing_activities  comprehensive_income_loss  \
0                             -1.109284e+09               1.117160e+09
1                             -5.745670e+08               8.783150e+08
2                             -1.483463e+09               6.001590e+08
3                             -1.277279e+09               4.652250e+09
4                             -4.342220e+08               1.473036e+09
5                             -1.322300e+08               1.322620e+09
6                             -1.257886e+09               5.123420e+08
7                             -2.404735e+09               4.456013e+09
8                             -4.840860e+08               1.409349e+09
9                             -1.881450e+08               8.370880e+08
10                            -6.296880e+08               1.015967e+09
11                            -2.446575e+09               2.466250e+09
12                            -9.174200e+08               6.664180e+08
13                            -1.587660e+08               5.174060e+08
14                            -7.898040e+08               3.984520e+08

    comprehensive_income_loss_attributable_to_parent  \
0                                       1.109204e+09
1                                       8.722670e+08
2                                       5.973850e+08
3                                       4.617874e+09
4                                       1.467686e+09
5                                       1.320818e+09
6                                       5.066080e+08
7                                       4.429575e+09
8                                       1.407019e+09
9                                       8.316790e+08
10                                      1.000427e+09
11                                      2.463733e+09
12                                      6.665930e+08
13                                      5.166160e+08
14                                      3.984060e+08

    other_comprehensive_income_loss  basic_earnings_per_share  \
0                          208000.0                      3.87
1                          573000.0                      3.01
2                          851000.0                      2.06
3                         3749000.0                     15.74
4                          342000.0                      5.04
5                           62000.0                      4.50
6                         3027000.0                      1.70
7                         -536000.0                     14.28
8                          131000.0                      4.52
9                          316000.0                      2.66
10                        -942000.0                      3.20
11                       -1303000.0                      7.88
12                         175000.0                      2.13
13                        -790000.0                      1.66
14                         -46000.0                      1.27

    operating_expenses      revenues  cost_of_revenue  gross_profit  symbol
0         7.258891e+09  8.729603e+09              NaN           NaN     LEN
1         6.852312e+09  8.045151e+09              NaN           NaN     LEN
2         5.674155e+09  6.490429e+09              NaN           NaN     LEN
3                  NaN           NaN              NaN           NaN     LEN
4                  NaN           NaN              NaN           NaN     LEN
5                  NaN           NaN              NaN           NaN     LEN
6         5.400582e+09  6.203516e+09              NaN           NaN     LEN
7         2.205224e+10  2.713068e+10              NaN           NaN     LEN
8         5.016908e+09  6.941403e+09              NaN           NaN     LEN
9         5.228150e+09  6.430245e+09              NaN           NaN     LEN
10        3.875609e+09  5.325468e+09              NaN           NaN     LEN
11        1.900665e+10  2.248885e+10     1.774076e+10  4.748090e+09     LEN
12        4.035370e+08  5.870254e+09     4.607704e+09  1.262550e+09     LEN
13        3.857330e+08  5.287373e+09     4.225063e+09  1.062310e+09     LEN
14        4.254360e+08  4.505337e+09     3.656349e+09  8.489880e+08     LEN

[15 rows x 29 columns]
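
For anyone wanting to flag these gaps programmatically before aggregating, a minimal sketch follows. It assumes each record has already been flattened into a plain dict with `fiscal_period`, `fiscal_year`, and the metric of interest as keys; `is_missing` and `missing_periods` are hypothetical helper names, and the sample values are the revenues from rows 2 through 6 of the output above.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_periods(rows, field):
    """Return (fiscal_period, fiscal_year) for every row where `field` is missing."""
    return [
        (row["fiscal_period"], row["fiscal_year"])
        for row in rows
        if is_missing(row.get(field))
    ]


nan = float("nan")
# Revenues for rows 2-6 of the data frame printed above
sample = [
    {"fiscal_period": "Q1", "fiscal_year": 2023, "revenues": 6.490429e+09},
    {"fiscal_period": "FY", "fiscal_year": 2022, "revenues": nan},
    {"fiscal_period": "Q3", "fiscal_year": 2022, "revenues": nan},
    {"fiscal_period": "Q2", "fiscal_year": 2022, "revenues": nan},
    {"fiscal_period": "Q1", "fiscal_year": 2022, "revenues": 6.203516e+09},
]

print(missing_periods(sample, "revenues"))
# [('FY', 2022), ('Q3', 2022), ('Q2', 2022)]
```

A key that is absent from a row is reported the same way as a NaN, so the check also covers records where the field was never returned.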

JeremyWhittaker added the bug label Oct 13, 2023
Contributor

justinpolygon commented Oct 13, 2023

Thanks for the heads up @JeremyWhittaker, and thanks for the very detailed write-up; it helps us track things down quickly. After taking a look, this is more of a data issue/gap than a client library issue, so I pinged the backend data team and they will check it out. I'll keep you posted.

@JeremyWhittaker
Author

Appreciate it. I just signed up for your service and I'm trying to find reliable data to analyze. When I run across stuff like this, it makes me start to question all of my output.

@JeremyWhittaker
Author

Same symbol, huge chunks of data missing from this metric as well:

image

image

INFO:root: fiscal_period fiscal_year end_date filing_date cost_of_revenue
0 Q3 2023 2023-08-31 2023-09-29 NaN
1 Q2 2023 2023-05-31 2023-06-30 NaN
2 Q1 2023 2023-02-28 2023-04-04 NaN
3 FY 2022 2022-11-30 2023-01-26 NaN
4 Q3 2022 2022-08-31 2022-10-04 NaN
5 Q2 2022 2022-05-31 2022-07-01 NaN
6 Q1 2022 2022-02-28 2022-04-01 NaN
7 FY 2021 2021-11-30 2022-01-28 NaN
8 Q3 2021 2021-08-31 2021-10-01 NaN
9 Q2 2021 2021-05-31 2021-07-02 NaN
10 Q1 2021 2021-02-28 2021-04-01 NaN
11 FY 2020 2020-11-30 2021-01-22 1.774076e+10
12 Q3 2020 2020-08-31 2020-10-01 4.607704e+09
13 Q2 2020 2020-05-31 2020-07-06 4.225063e+09
14 Q1 2020 2020-02-29 2020-04-07 3.656349e+09
15 FY 2019 2019-11-30 2020-01-27 1.802340e+10
16 Q3 2019 2019-08-31 2019-10-08 4.771425e+09
17 Q2 2019 2019-05-31 2019-07-03 4.524303e+09
18 Q1 2019 2019-02-28 2019-04-08 3.142003e+09
19 FY 2018 2018-11-30 2019-01-28 1.688282e+10
20 Q3 2018 2018-08-31 2018-10-09 4.614666e+09
21 Q2 2018 2018-05-31 2018-07-06 4.619019e+09
22 Q1 2018 2018-02-28 2018-04-09 2.464163e+09
23 FY 2017 2017-11-30 2018-01-25 1.021241e+10
24 Q3 2017 2017-08-31 2017-10-10 2.611065e+09
25 Q2 2017 2017-05-31 2017-06-30 2.645017e+09
26 Q1 2017 2017-02-28 2017-04-10 1.918263e+09
27 FY 2016 2016-11-30 2017-01-20 8.754335e+09
28 Q3 2016 2016-08-31 2016-10-04 2.282218e+09
29 Q2 2016 2016-05-31 2016-07-01 2.184292e+09
30 Q1 2016 2016-02-29 2016-04-06 1.594718e+09
31 FY 2015 2015-11-30 2016-01-22 NaN
32 Q3 2015 2015-08-31 2015-10-09 NaN
33 Q2 2015 2015-05-31 2015-07-02 NaN
34 Q1 2015 2015-02-28 2015-04-03 NaN
35 FY 2014 2014-11-30 2015-01-23 NaN
36 Q3 2014 2014-08-31 2014-10-03 NaN
37 Q2 2014 2014-05-31 2014-07-03 NaN
38 Q1 2014 2014-02-28 2014-04-09 NaN
39 FY 2013 2013-11-30 2014-01-28 NaN
40 Q3 2013 2013-08-31 2013-10-10 NaN
41 Q2 2013 2013-05-31 2013-07-10 NaN
42 Q1 2013 2013-02-28 2013-04-09 NaN
43 FY 2012 2012-11-30 2013-01-29 NaN
44 Q3 2012 2012-08-31 2012-10-10 NaN
45 Q2 2012 2012-05-31 2012-07-10 NaN
46 Q1 2012 2012-02-29 2012-04-09 NaN
47 FY 2011 2011-11-30 2012-01-30 NaN
48 Q3 2011 2011-08-31 2011-10-11 NaN
49 Q2 2011 2011-05-31 2011-07-11 NaN
50 Q1 2011 2011-02-28 2011-04-11 NaN
51 Q3 2010 2010-08-31 2010-10-08 NaN
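
Besides the NaN values, the listing above also skips a filing entirely: row 50 is Q1 2011 and row 51 is Q3 2010, with no FY 2010 row between them. A rough sketch for spotting such holes is shown below. It assumes the period order Q1, Q2, Q3, FY within a fiscal year, as in this listing; `absent_filings` is a hypothetical helper name.

```python
PERIODS = ["Q1", "Q2", "Q3", "FY"]  # assumed order within a fiscal year


def absent_filings(filings):
    """Return (fiscal_year, fiscal_period) pairs missing between the oldest and newest filing."""
    seen = {(year, PERIODS.index(period)) for period, year in filings}
    first, last = min(seen), max(seen)
    absent = []
    year, idx = first
    # Walk every expected period from the oldest filing to the newest
    while (year, idx) <= last:
        if (year, idx) not in seen:
            absent.append((year, PERIODS[idx]))
        idx += 1
        if idx == len(PERIODS):
            year, idx = year + 1, 0
    return absent


# Rows 48-51 of the listing above, as (fiscal_period, fiscal_year)
tail = [("Q3", 2011), ("Q2", 2011), ("Q1", 2011), ("Q3", 2010)]
print(absent_filings(tail))
# [(2010, 'FY')]
```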
