Blade He
75ea5e70de
1. Support fetching data from garbled-text pages with the ChatGPT4o Vision function.
2. Multilingual share feature configuration.
2024-12-09 17:47:42 -06:00
Blade He
d96f77fe00
Split share class names when multiple share classes appear on the same line
2024-12-06 16:31:42 -06:00
Blade He
a25991e2bb
1. Set TOR reported name priority
2. Optimize investment mapping logic
2024-12-06 09:54:43 -06:00
Blade He
95c386911c
Clean fund name after getting response from ChatGPT
2024-12-04 22:08:09 -06:00
Blade He
70362b554f
Fix the "The last fund name of previous PDF page" logic:
If the current page's fund name starts with "The last fund name of previous PDF page" and more content follows it, remove "The last fund name of previous PDF page".
2024-12-04 16:57:52 -06:00
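A minimal sketch of the rule in commit 70362b554f, under one reading of the message: the fund name extracted for the current page begins with the name carried over from the previous page and has more text after it, so the carried-over part is dropped. The function and parameter names are hypothetical and are not taken from the repository.

```python
def strip_previous_fund_name(fund_name: str, previous_page_fund_name: str) -> str:
    """Drop a leading carried-over fund name when more text follows it.

    Hypothetical helper; names are illustrative, not from the repository.
    """
    name = fund_name.strip()
    previous = previous_page_fund_name.strip()
    if previous and name.startswith(previous):
        # Whatever follows the carried-over name, minus separator characters.
        remainder = name[len(previous):].strip(" \n\t:-")
        if remainder:
            return remainder
    # No prefix match, or nothing follows the prefix: keep the name unchanged.
    return name
```

A plain prefix check like this would also shorten a name such as "Alpha Fund II" when the previous page ended with "Alpha Fund", so a real implementation may need a stricter boundary test.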
Blade He
36fbaa946e
Add a statement when carrying over the last fund name of the previous PDF page:
The last fund name of previous PDF page:
page_text = f"\nThe last fund name of previous PDF page: {previous_page_fund_name}\n{page_text}"
2024-12-03 11:50:31 -06:00
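The f-string quoted in commit 36fbaa946e can be wrapped in a small helper, sketched here. Only the f-string itself comes from the commit message; the function name and the guard for a missing fund name are assumptions.

```python
from typing import Optional


def add_previous_fund_name(page_text: str, previous_page_fund_name: Optional[str]) -> str:
    """Prefix the page text with the fund name carried over from the previous page.

    Hypothetical wrapper; only the f-string below appears in the commit message.
    """
    if not previous_page_fund_name:
        # Assumed guard: nothing to carry over, e.g. on the first page.
        return page_text
    return f"\nThe last fund name of previous PDF page: {previous_page_fund_name}\n{page_text}"
```

For example, `add_previous_fund_name("Class A 1.2%", "Alpha Fund")` gives the page text a leading context line, so share class rows at the top of a page can still be tied to a fund.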
Blade He
a11a99fdc3
1. Optimize instructions: do not fetch data qualified by an "up to" statement.
2. Add an exception handler in the function.
2024-12-03 11:27:28 -06:00
Blade He
bc32860f87
remove_abundant_data
2024-12-02 17:16:56 -06:00
Blade He
843bbbd13f
dynamically load instructions for multilingual support.
2024-11-20 17:00:22 -06:00
Blade He
2645d528b1
support outputting the data point reported name
2024-10-29 16:47:45 -05:00
Blade He
9d453c9fae
minor updates
2024-10-28 15:15:55 -05:00
Blade He
3f2bb38208
Resolve the issue where the first records have only a share class name and no fund name (the fund name is in the previous page's text).
2024-10-16 16:55:32 -05:00
Blade He
f166e73362
optimize data extraction algorithm: if no numeric cost value can be found in the PDF page text, extract the data with Vision ChatGPT
2024-10-15 15:57:54 -05:00
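The fallback in commit f166e73362 could look roughly like the sketch below, assuming the check is simply "does the page text contain any number". The regular expression, the function name, and the two extractor callables are all hypothetical stand-ins; the actual ChatGPT calls are not shown in the log.

```python
import re
from typing import Callable, Dict, List

Record = Dict[str, str]

# Assumed pattern for a numeric value such as "1.25", "0,85" or "12".
NUMERIC_VALUE = re.compile(r"\d+(?:[.,]\d+)?")


def extract_page_data(
    page_text: str,
    extract_from_text: Callable[[str], List[Record]],
    extract_from_image: Callable[[], List[Record]],
) -> List[Record]:
    """Use text extraction when the page text has numbers, else the image path."""
    if NUMERIC_VALUE.search(page_text):
        return extract_from_text(page_text)
    # No numeric value in the text layer: fall back to the vision-based extractor.
    return extract_from_image()
```

Passing the extractors as callables keeps the sketch independent of any particular client library.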
Blade He
df66489c5f
support the scenario where a fund and its share have the same name.
2024-10-11 13:14:04 -05:00
Blade He
17284c74f0
optimize investment mapping: share feature logic
2024-10-09 14:07:07 -05:00
Blade He
04a2409c58
optimize investment mapping algorithm
2024-10-08 23:53:55 -05:00
Blade He
aa2c2332ae
optimize for more cases
2024-10-08 17:16:01 -05:00
Blade He
d92053a16e
optimize mapping metrics algorithm
2024-10-01 12:19:45 -05:00
Blade He
18174bf1cf
optimize mapping: choose the proper candidate mapping list.
2024-10-01 11:35:29 -05:00
Blade He
60a26377e5
optimize investment mapping algorithm
2024-09-30 16:32:56 -05:00
Blade He
3aa596ea33
optimize mapping logic
2024-09-27 16:39:56 -05:00
Blade He
39cd53dc33
support calculating mapping metrics based on the document investment mapping in the database
2024-09-27 13:20:50 -05:00
Blade He
598e2ab820
investment mapping: optimize for currency logic
2024-09-25 17:28:22 -05:00
Blade He
dd6701f18c
1. optimize investment mapping algorithm
2. implement investment mapping metrics
2024-09-25 15:15:38 -05:00
Blade He
0f14bf4a7a
1. get document/provider mapping data
2. optimize metrics algorithm
3. Expand max token length after switching ChatGPT4o to the 2024-08-06 version.
2024-09-23 17:21:02 -05:00
Blade He
8496c7b5ed
optimize instructions
optimize metrics algorithm
2024-09-20 16:46:44 -05:00
Blade He
91530d6089
add more description of the Performance Fees calculation rules
2024-09-20 11:58:48 -05:00
Blade He
c4985ac75f
optimize data extraction and metrics calculation algorithms
2024-09-19 22:45:08 -05:00
Blade He
48dc8690c3
support extracting data from PDF page images
2024-09-19 16:29:26 -05:00
Blade He
67371e534e
only calculate metrics for the intersection of the document lists
2024-09-19 11:54:51 -05:00
Blade He
27b3540c63
optimize metrics calculation algorithm
2024-09-19 11:44:17 -05:00
Blade He
98e86a6cfd
implement data extraction metrics calculation.
2024-09-18 17:10:54 -05:00
Blade He
50e6c3c19d
minor change
2024-09-16 16:43:03 -05:00
Blade He
932870f406
support splitting text when the output exceeds 4K tokens.
2024-09-16 12:03:13 -05:00
Blade He
e17414173a
update to get more precise results
2024-09-12 16:00:49 -05:00
Blade He
0887608719
support auto-mapping fund/share by raw names.
2024-09-09 17:34:53 -05:00
Blade He
878383a72c
support extracting continuation page(s) so that data on a following page without a table header is not missed.
2024-09-06 16:29:35 -05:00
Blade He
1caf552065
support extracting data with ChatGPT4o.
The instructions are generated dynamically.
2024-09-05 17:22:26 -05:00
Blade He
7c83f9152a
try to improve page filter precision
2024-09-04 17:01:12 -05:00
Blade He
7198450e53
support calculating page filter metrics.
2024-09-03 17:07:53 -05:00
Blade He
32676728f6
optimize prompts
2024-08-28 10:21:26 -05:00
Blade He
6519dc23d4
support filtering pages by data point keywords
2024-08-23 16:38:11 -05:00
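A rough sketch of keyword-based page filtering as described in commit 6519dc23d4, assuming page texts are keyed by page number and each data point has a list of keywords. The function name, the data shapes, and the case-insensitive substring match are assumptions, not the repository's implementation.

```python
from typing import Dict, List


def filter_pages_by_keywords(
    pages: Dict[int, str],
    keywords: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Return, per data point, the page numbers whose text contains any keyword."""
    result: Dict[str, List[int]] = {}
    for data_point, words in keywords.items():
        lowered = [word.lower() for word in words]
        # Case-insensitive substring match, pages visited in ascending order.
        result[data_point] = [
            page_no
            for page_no, text in sorted(pages.items())
            if any(word in text.lower() for word in lowered)
        ]
    return result
```

Substring matching is deliberately simple here; short keywords can match inside unrelated words, so word-boundary matching may be preferable in practice.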