Commit Graph

25 Commits

Author SHA1 Message Date
Blade He a25991e2bb 1. Set TOR reported name priority
2. Optimize investment mapping logic
2024-12-06 09:54:43 -06:00
Blade He 95c386911c Clean fund name after getting response from ChatGPT 2024-12-04 22:08:09 -06:00
Blade He 70362b554f Fix the "The last fund name of previous PDF page" logic:
If the current page's fund name starts with "The last fund name of previous PDF page" and more content follows it, remove "The last fund name of previous PDF page".
2024-12-04 16:57:52 -06:00
Blade He 36fbaa946e Prepend a statement when carrying the last fund name of the previous PDF page over to the current page:
The last fund name of previous PDF page:
page_text = f"\nThe last fund name of previous PDF page: {previous_page_fund_name}\n{page_text}"
2024-12-03 11:50:31 -06:00
Blade He a11a99fdc3 1. Optimize instructions: do not fetch data from "up to" statements.
2. Add exception handler in function.
2024-12-03 11:27:28 -06:00
Blade He bc32860f87 remove_abundant_data 2024-12-02 17:16:56 -06:00
Blade He 843bbbd13f dynamically load instructions for multilingual support. 2024-11-20 17:00:22 -06:00
Blade He 2645d528b1 support output data point reported name 2024-10-29 16:47:45 -05:00
Blade He 3f2bb38208 Resolve issue where the first records have only a share class name and no fund name (the fund name is in the previous page text). 2024-10-16 16:55:32 -05:00
Blade He f166e73362 optimize data extraction algorithm: if no numeric cost value can be found in the PDF page text, extract the data with Vision ChatGPT 2024-10-15 15:57:54 -05:00
Blade He df66489c5f support this scenario: fund and share have the same name. 2024-10-11 13:14:04 -05:00
Blade He 17284c74f0 optimize for investment mapping: share feature logic 2024-10-09 14:07:07 -05:00
Blade He 3aa596ea33 optimize mapping logic 2024-09-27 16:39:56 -05:00
Blade He 0f14bf4a7a 1. get document/provider mapping data
2. optimize metrics algorithm
3. Expand max token length after switching ChatGPT4o to the 2024-08-06 version.
2024-09-23 17:21:02 -05:00
Blade He 8496c7b5ed optimize instructions
optimize metrics algorithm
2024-09-20 16:46:44 -05:00
Blade He 91530d6089 add more description for Performance Fees calculation rules 2024-09-20 11:58:48 -05:00
Blade He c4985ac75f optimize data extraction and metrics calculation algorithms 2024-09-19 22:45:08 -05:00
Blade He 48dc8690c3 support extracting data from PDF page images 2024-09-19 16:29:26 -05:00
Blade He 27b3540c63 optimize metrics calculation algorithm 2024-09-19 11:44:17 -05:00
Blade He 50e6c3c19d a little change 2024-09-16 16:43:03 -05:00
Blade He 932870f406 support splitting text when the output exceeds 4K tokens. 2024-09-16 12:03:13 -05:00
Blade He e17414173a update to get more precise results 2024-09-12 16:00:49 -05:00
Blade He 0887608719 support auto-mapping fund/share by raw names. 2024-09-09 17:34:53 -05:00
Blade He 878383a72c support extracting continuous page(s) so that data on a following page without a table header is not missed. 2024-09-06 16:29:35 -05:00
Blade He 1caf552065 support extracting data with ChatGPT4o.
The instructions are generated dynamically.
2024-09-05 17:22:26 -05:00
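
The two commits on the "last fund name of previous PDF page" logic (36fbaa946e and 70362b554f) can be illustrated with a minimal sketch. Only the f-string is taken from the commit message; the function names, the constant, and the reading of "more contents below" as "any non-empty text after the statement" are assumptions made for illustration, not the repository's actual code.

```python
# Hypothetical sketch; helper names and the cleanup rule are assumptions.
PREVIOUS_FUND_STATEMENT = "The last fund name of previous PDF page:"


def add_previous_fund_context(page_text, previous_page_fund_name):
    """Prepend the previous page's last fund name to the current page text."""
    if not previous_page_fund_name:
        # Nothing to carry over, e.g. on the first page.
        return page_text
    # This f-string is the one quoted in commit 36fbaa946e.
    return f"\nThe last fund name of previous PDF page: {previous_page_fund_name}\n{page_text}"


def clean_fund_name(fund_name):
    """Drop the carried-over statement if it leaked into an extracted fund name."""
    stripped = fund_name.strip()
    if stripped.startswith(PREVIOUS_FUND_STATEMENT):
        remainder = stripped[len(PREVIOUS_FUND_STATEMENT):].strip()
        if remainder:
            # More content follows the statement, so keep only that content.
            return remainder
    return stripped
```

Under these assumptions, clean_fund_name("The last fund name of previous PDF page: Fund X") gives "Fund X", while a name that does not start with the statement is returned unchanged apart from surrounding whitespace.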