Blade He
f06355e0c8
optimize mapping algorithm: check whether a "-" exists to connect share names
2024-10-02 11:38:11 -05:00
Blade He
035f028155
optimize mapping algorithm
2024-10-01 16:46:59 -05:00
Blade He
3adbd7631a
optimize mapping algorithm
2024-10-01 15:31:15 -05:00
Blade He
d92053a16e
optimize mapping metrics algorithm
2024-10-01 12:19:45 -05:00
Blade He
18174bf1cf
optimize mapping: choose proper candidates mapping list.
2024-10-01 11:35:29 -05:00
Blade He
60a26377e5
optimize investment mapping algorithm
2024-09-30 16:32:56 -05:00
Blade He
3aa596ea33
optimize mapping logic
2024-09-27 16:39:56 -05:00
Blade He
39cd53dc33
support calculating mapping metrics based on the document investment mapping in the database
2024-09-27 13:20:50 -05:00
Blade He
0c4c541319
optimize mapping algorithm; this is the fixed version for confirming mapping metrics
2024-09-27 09:25:11 -05:00
Blade He
7eba9a52ae
revert algorithm to the better version
2024-09-26 19:25:17 -05:00
Blade He
d25bae936c
Optimize investment mapping algorithm.
2024-09-26 12:18:37 -05:00
Blade He
598e2ab820
investment mapping: optimize for currency logic
2024-09-25 17:28:22 -05:00
Blade He
dd6701f18c
1. optimize investment mapping algorithm
2. implement investment mapping metrics
2024-09-25 15:15:38 -05:00
Blade He
0f14bf4a7a
1. get document/provider mapping data
2. optimize metrics algorithm
3. Expand max token length after switching ChatGPT4o to the 2024-08-06 version.
2024-09-23 17:21:02 -05:00
Blade He
8496c7b5ed
optimize instructions
optimize metrics algorithm
2024-09-20 16:46:44 -05:00
Blade He
91530d6089
add more description for Performance Fees calculation rules
2024-09-20 11:58:48 -05:00
Blade He
40bcce4404
instructions: explicitly state not to collect data whose value is -, *, **, N/A, N/A%, N/A %, or NONE
2024-09-20 10:26:18 -05:00
Blade He
c4985ac75f
optimize data extraction and metrics calculation algorithms
2024-09-19 22:45:08 -05:00
Blade He
48dc8690c3
support extracting data from PDF page images
2024-09-19 16:29:26 -05:00
Blade He
67371e534e
only calculate metrics for the intersection of document lists
2024-09-19 11:54:51 -05:00
Blade He
27b3540c63
optimize metrics calculation algorithm
2024-09-19 11:44:17 -05:00
Blade He
98e86a6cfd
implement calculation of data extraction metrics.
2024-09-18 17:10:54 -05:00
Blade He
50e6c3c19d
a little change
2024-09-16 16:43:03 -05:00
Blade He
932870f406
support splitting text when the output exceeds 4K tokens.
2024-09-16 12:03:13 -05:00
Blade He
0f6dbd27eb
optimize instructions for performance fees.
2024-09-13 16:10:44 -05:00
Blade He
e17414173a
update to get more precise results
2024-09-12 16:00:49 -05:00
Blade He
d56ac9482e
Adjust for output example format
2024-09-11 09:24:36 -05:00
Blade He
0887608719
support auto-mapping fund/share by raw names.
2024-09-09 17:34:53 -05:00
Blade He
878383a72c
support extracting consecutive page(s) so that next-page data without a table header is not missed.
2024-09-06 16:29:35 -05:00
Blade He
1caf552065
support extracting data with ChatGPT4o.
The instructions are generated dynamically.
2024-09-05 17:22:26 -05:00
Blade He
7c83f9152a
try to improve page filter precision
2024-09-04 17:01:12 -05:00
Blade He
7198450e53
support calculating page filter metrics.
2024-09-03 17:07:53 -05:00
Blade He
f81e2862f3
update prompts to extract TOR, OGC, TER, and performance fees data.
2024-08-30 16:37:00 -05:00
Blade He
63da030fe1
update general prompts
2024-08-29 17:05:58 -05:00
Blade He
134b365b68
Try to generate general prompts for LUX English AR
- Support outputting fund name, share name, TER, performance fees, OGC
- Only output data point and value which can be found in page text.
- Output fund level data and share level data separately.
- List some special cases to cover as many cases as possible.
2024-08-28 16:44:19 -05:00
Blade He
32676728f6
optimize prompts
2024-08-28 10:21:26 -05:00
Blade He
15720d8bfd
1. Text-and-image all in one chat function by ChatGPT4o
2. many experiments extracting data in two ways: page text or page image.
2024-08-26 17:17:39 -05:00
Blade He
843f588015
support chatting with images via ChatGPT4o
2024-08-26 11:19:07 -05:00
Blade He
6519dc23d4
support filtering pages by data point keywords
2024-08-23 16:38:11 -05:00
Blade He
993664cf78
add many functions to prepare data.
2024-08-22 10:37:56 -05:00
Blade He
f91e0cf1a8
auto-fix JSON data format
2024-08-19 17:59:32 -05:00
Blade He
fa46b45ad5
support outputting tables from PDF documents in Markdown format
2024-08-19 15:49:45 -05:00
Blade He
424c30853c
initial
2024-08-19 09:52:13 -05:00