Blade He
843bbbd13f
load instructions dynamically for multilingual support.
2024-11-20 17:00:22 -06:00
Blade He
8223ca9a5c
a little change
2024-11-18 16:13:24 -06:00
Blade He
a42c0b5c2b
optimize retrieve fund instructions
2024-11-13 10:25:08 -06:00
Blade He
7a41b03634
1. optimize instructions for fund name
2. optimize drilldown logic
2024-11-12 17:01:10 -06:00
Blade He
c2d2e54670
"total match" logic for single-word values: need to consider the "\n" character scenario
2024-11-12 11:40:19 -06:00
Blade He
5b67bd332b
optimize drilldown algorithm
2024-11-12 11:20:38 -06:00
Blade He
c6c3e99d3e
integrate PDF drilldown logic into pdf_util.py
2024-11-11 16:34:25 -06:00
Blade He
c34e2e960e
optimize drilldown algorithm
2024-11-08 15:00:34 -06:00
Blade He
81f855f725
support drilldown data to PDF
2024-11-08 11:22:35 -06:00
Blade He
0349033eaf
update for more statistics methods
2024-11-06 16:39:42 -06:00
Blade He
81a424b00d
Support replacing share class names in the database with more readable versions.
Examples from document 532422720:
M&G European Credit Investment Fund A CHFH Acc -> M&G European Credit Investment Fund A CHF H Accumulation
M&G European Credit Investment Fund A CHFHInc -> M&G European Credit Investment Fund A CHF H Income
M&G European High Yield Credit Investment Fund E GBPHedgedAcc -> M&G European High Yield Credit Investment Fund E GBP Hedged Accumulation
2024-11-05 11:14:56 -06:00
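The renaming described in commit 81a424b00d could be sketched roughly as follows. This is only an illustration built from the three examples in that message, not the repository's actual code: the function name `expand_share_class_name`, the currency list, and the abbreviation table are all assumptions.

```python
import re

# Assumed inputs: only the currencies and abbreviations needed for the
# examples in the commit message; a real list would be longer.
CURRENCIES = ("CHF", "GBP", "EUR", "USD")
ABBREVIATIONS = {"Acc": "Accumulation", "Inc": "Income"}

# A fused token: currency code, optional hedge marker, optional
# distribution abbreviation, e.g. "CHFHInc" or "GBPHedgedAcc".
FUSED_TOKEN = re.compile(
    r"^(?P<currency>" + "|".join(CURRENCIES) + r")"
    r"(?P<hedge>Hedged|H)?"
    r"(?P<distribution>Acc|Inc)?$"
)


def expand_share_class_name(name: str) -> str:
    """Split fused tokens and expand known abbreviations (hypothetical helper)."""
    words = []
    for token in name.split():
        match = FUSED_TOKEN.match(token)
        # Keep non-matching tokens whole; otherwise keep the matched parts.
        parts = [p for p in match.groups() if p] if match else [token]
        words.extend(ABBREVIATIONS.get(part, part) for part in parts)
    return " ".join(words)


print(expand_share_class_name("M&G European Credit Investment Fund A CHFHInc"))
```

Working token by token keeps the fund name itself untouched, so only the trailing share class features are rewritten.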
Blade He
2645d528b1
support outputting the data point reported name
2024-10-29 16:47:45 -05:00
Blade He
9d453c9fae
a few small updates
2024-10-28 15:15:55 -05:00
Blade He
fa763f4f14
1. optimize instructions
2. optimize mapping algorithm
2024-10-24 16:24:21 -05:00
Blade He
53dadf61f4
optimize keywords/instructions for special-case documents.
2024-10-23 16:56:43 -05:00
Blade He
171f3b6d1f
optimize for OGC data extraction.
2024-10-23 16:07:54 -05:00
Blade He
03365227b9
optimize instructions
2024-10-21 11:04:53 -05:00
Blade He
3f2bb38208
Resolve the issue where the first records have only a share class name and no fund name (the fund name is in the previous page's text).
2024-10-16 16:55:32 -05:00
Blade He
f166e73362
optimize data extraction algorithm: if no numeric cost value can be found in the PDF page text, extract the data with Vision ChatGPT
2024-10-15 15:57:54 -05:00
Blade He
8b651f374c
optimize instructions
2024-10-14 09:12:05 -05:00
Blade He
df66489c5f
support this scenario: fund and share have the same name.
2024-10-11 13:14:04 -05:00
Blade He
92a26cd262
optimize configuration
2024-10-11 12:16:34 -05:00
Blade He
17284c74f0
optimize for investment mapping: share feature logic
2024-10-09 14:07:07 -05:00
Blade He
04a2409c58
optimize investment mapping algorithm
2024-10-08 23:53:55 -05:00
Blade He
aa2c2332ae
optimize for more cases
2024-10-08 17:16:01 -05:00
Blade He
8bd6008425
refactor code
2024-10-07 10:34:13 -05:00
Blade He
b18c48efeb
A little change
2024-10-03 16:31:16 -05:00
Blade He
f0dd7f9e89
Consider cases with multiple share short names.
2024-10-02 17:25:25 -05:00
Blade He
edb90c718e
Optimize mapping algorithm
Some share class names have multiple short names, e.g.
CPR Invest Global Disruptive Opportunities Class I sw EUR - Acc
The short names are I and sw.
The purpose is to support getting all short names from a share class name.
2024-10-02 15:08:26 -05:00
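One way to read the short-name extraction from commit edb90c718e is sketched below. The rule used here (take every token after the word "Class" until a currency code or a "-" appears) is an assumption inferred from the single example in the message, and `get_share_short_names` and the currency set are hypothetical.

```python
# Assumed currency codes; the real mapping code may use a different list.
CURRENCIES = {"EUR", "USD", "GBP", "CHF"}


def get_share_short_names(share_class_name: str) -> list[str]:
    """Collect all short names that follow the word "Class" (hypothetical rule)."""
    tokens = share_class_name.split()
    if "Class" not in tokens:
        return []
    short_names = []
    for token in tokens[tokens.index("Class") + 1:]:
        # Stop at the first currency code or separator.
        if token in CURRENCIES or token == "-":
            break
        short_names.append(token)
    return short_names


print(get_share_short_names(
    "CPR Invest Global Disruptive Opportunities Class I sw EUR - Acc"
))
```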
Blade He
3bb13947af
Optimize mapping algorithm:
If a fund/share name contains multiple currencies and USD is one of them, remove USD.
Fix the issue with splitting words that are not separated by a space.
If the share class name has no currency, try to get the currency from a document mapping entry with the same fund name and the same share class short name.
2024-10-02 13:25:08 -05:00
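Two of the currency rules from commit 3bb13947af can be illustrated with a small sketch. The function names, the currency list, and the shape of the document mapping rows (dicts with `fund_name`, `short_name`, and `currency` keys) are assumptions made for the example, not the repository's actual data model.

```python
# Assumed currency codes for illustration.
CURRENCIES = ("USD", "EUR", "GBP", "CHF")


def drop_usd_if_multiple(name: str) -> str:
    """If the name has several currencies including USD, remove USD."""
    tokens = name.split()
    found = {t for t in tokens if t in CURRENCIES}
    if len(found) > 1 and "USD" in found:
        return " ".join(t for t in tokens if t != "USD")
    return name


def infer_currency(fund_name, short_name, document_mapping):
    """Borrow the currency of a mapping row with the same fund and short name."""
    for row in document_mapping:
        same_fund = row["fund_name"] == fund_name
        same_short = row["short_name"] == short_name
        if same_fund and same_short and row.get("currency"):
            return row["currency"]
    return None  # No matching row: leave the currency unknown.


mapping = [{"fund_name": "Global Fund", "short_name": "A", "currency": "EUR"}]
print(drop_usd_if_multiple("Global Fund A USD EUR Acc"))
print(infer_currency("Global Fund", "A", mapping))
```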
Blade He
f06355e0c8
optimize mapping algorithm: check whether a "-" connects share names
2024-10-02 11:38:11 -05:00
Blade He
035f028155
optimize mapping algorithm
2024-10-01 16:46:59 -05:00
Blade He
3adbd7631a
optimize mapping algorithm
2024-10-01 15:31:15 -05:00
Blade He
d92053a16e
optimize mapping metrics algorithm
2024-10-01 12:19:45 -05:00
Blade He
18174bf1cf
optimize mapping: choose the proper candidate mapping list.
2024-10-01 11:35:29 -05:00
Blade He
60a26377e5
optimize investment mapping algorithm
2024-09-30 16:32:56 -05:00
Blade He
3aa596ea33
optimize mapping logic
2024-09-27 16:39:56 -05:00
Blade He
39cd53dc33
support calculating mapping metrics based on the document investment mapping in the database
2024-09-27 13:20:50 -05:00
Blade He
0c4c541319
optimize mapping algorithm, this is the fixed version to confirm mapping metrics
2024-09-27 09:25:11 -05:00
Blade He
7eba9a52ae
revert algorithm to the better version
2024-09-26 19:25:17 -05:00
Blade He
d25bae936c
Optimize investment mapping algorithm.
2024-09-26 12:18:37 -05:00
Blade He
598e2ab820
investment mapping: optimize for currency logic
2024-09-25 17:28:22 -05:00
Blade He
dd6701f18c
1. optimize investment mapping algorithm
2. implement investment mapping metrics
2024-09-25 15:15:38 -05:00
Blade He
0f14bf4a7a
1. get document/ provider mapping data
2. optimize metrics algorithm
3. Expand max token length after switching ChatGPT4o to the 2024-08-06 version.
2024-09-23 17:21:02 -05:00
Blade He
8496c7b5ed
optimize instructions
optimize metrics algorithm
2024-09-20 16:46:44 -05:00
Blade He
91530d6089
add more description for Performance Fees calculation rules
2024-09-20 11:58:48 -05:00
Blade He
40bcce4404
instructions: explicitly state not to collect data whose value is -, *, **, N/A, N/A%, N/A %, or NONE
2024-09-20 10:26:18 -05:00
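Commit 40bcce4404 adds this rule to the model instructions. A post-extraction check for the same placeholder values might look like the sketch below; it is an assumed complement to the instructions, and the record shape (a dict with a `value` key) and both function names are hypothetical.

```python
# Placeholder values listed in the commit message.
PLACEHOLDER_VALUES = {"-", "*", "**", "N/A", "N/A%", "N/A %", "NONE"}


def is_placeholder(value: str) -> bool:
    """Return True when the value is one of the listed placeholders."""
    # Case-insensitive comparison is an assumption, not stated in the commit.
    return value.strip().upper() in PLACEHOLDER_VALUES


def drop_placeholder_records(records):
    """Keep only records whose value is not a placeholder."""
    return [r for r in records if not is_placeholder(str(r["value"]))]


records = [{"value": "1.25"}, {"value": "N/A %"}, {"value": "-"}]
print(drop_placeholder_records(records))
```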
Blade He
c4985ac75f
optimize data extraction and metrics calculation algorithms
2024-09-19 22:45:08 -05:00
Blade He
48dc8690c3
support extracting data from PDF page images
2024-09-19 16:29:26 -05:00
Blade He
67371e534e
only calculate metrics for intersection document list
2024-09-19 11:54:51 -05:00
Blade He
27b3540c63
optimize metrics calculation algorithm
2024-09-19 11:44:17 -05:00
Blade He
98e86a6cfd
implement calculation of data extraction metrics.
2024-09-18 17:10:54 -05:00
Blade He
50e6c3c19d
a little change
2024-09-16 16:43:03 -05:00
Blade He
932870f406
support splitting text for this case: outputs over 4K tokens.
2024-09-16 12:03:13 -05:00
Blade He
0f6dbd27eb
optimize instructions for performance fees.
2024-09-13 16:10:44 -05:00
Blade He
e17414173a
update to get more precise results
2024-09-12 16:00:49 -05:00
Blade He
0887608719
support auto-mapping fund/share by raw names.
2024-09-09 17:34:53 -05:00
Blade He
878383a72c
support extracting consecutive page(s) so that data on a following page without a table header is not missed.
2024-09-06 16:29:35 -05:00
Blade He
1caf552065
support extracting data with ChatGPT4o.
The instructions are generated dynamically.
2024-09-05 17:22:26 -05:00
Blade He
7198450e53
support calculating page filter metrics.
2024-09-03 17:07:53 -05:00
Blade He
6519dc23d4
support filtering pages by data point keywords
2024-08-23 16:38:11 -05:00
Blade He
fa46b45ad5
support outputting tables from PDF documents in markdown format
2024-08-19 15:49:45 -05:00