blade
7b0c825a39
fix issue
2025-11-12 14:07:55 +08:00
blade
ea81197bcd
update to apply ALI QWEN as demo
2025-11-11 13:33:57 +08:00
Blade He
37cf06a394
Confirm span pages calculation, the management fee and costs page only with management_fee_and_costs and management_fee datapoints
2025-04-03 18:08:27 -05:00
Blade He
f333cc30f5
1. fit the scenario where document type is not 1, 4, or 5
...
2. support the scenario:
the "investment fees and costs including performance" statement appears on the performance fee data page instead of the management fee and costs data page.
2025-04-03 17:06:43 -05:00
Blade He
427a379b3b
1. support re-calling the ChatGPT API to match unmatched predicted fund/share names
...
2. If the document has fewer than 3 funds, skip the production name judgment logic
2025-04-02 16:34:41 -05:00
Blade He
4cee95db9a
fix issue for post actions
2025-03-31 22:04:31 -05:00
Blade He
984c686bf3
support separating table and page data that have specific business rules
2025-03-31 17:08:49 -05:00
Blade He
355b145cf7
If total_annual_dollar_based_charges is found and is divisible by 52 or 12,
...
then set the fund name and share name to the document production name
2025-03-28 01:33:33 -05:00
Blade He
8a5723c150
optimize for Entry Fee/ Nil Entry case
2025-03-27 21:10:33 -05:00
Blade He
d925992326
1. Support regex keywords for complex special cases
...
2. Support setting a sub-datapoints list on the complex special cases node.
3. Simplify the common management fee and costs instructions.
4. Add markdown title characters: ## or ### to instructions.
2025-03-27 16:00:19 -05:00
Blade He
ff2325c72d
1. fix issue with assigning values based on production name
...
2. optimize instructions for extract non-necessary data by Cost of Product message
2025-03-26 18:58:45 -05:00
Blade He
4edc4b4768
clean code
2025-03-24 17:10:16 -05:00
Blade He
9be6d1296d
update benchmark check logic
2025-03-19 00:52:25 -05:00
Blade He
5ba39a394b
1. keep fund/ share db list before applying LLM
...
2. add keywords for interposed_vehicle_performance_fee_cost
2025-03-18 22:15:31 -05:00
Blade He
c71936c5ff
1. optimize benchmark_name instructions
...
2. because a document may contain the same raw fund name multiple times, do not remove matched entries from unmatched_db_list when a relevant raw fund/share name is matched
Otherwise, some raw names cannot be matched to a db name.
2025-03-18 17:22:21 -05:00
Blade He
0cea2e501b
For AUS Prospectus, skip calling Vision ChatGPT when the page contents have no numeric text or contain garbled characters.
...
(But keep this logic for EMEA LUX AR, because of some special provider cases in this market's documents.)
2025-03-18 14:15:43 -05:00
Blade He
b3941ee4b3
update instructions for total_annual_dollar_based_charges
2025-03-17 15:07:02 -05:00
Blade He
dd15c1c48e
Optimize for benchmark name
2025-03-14 11:51:10 -05:00
Blade He
f539340d04
1. optimize instructions
...
Only load relevant fund name for investment objective, instead of full page text with the most recent investment objective
2. Exclude tables with only one numeric column: Cost Product
2025-03-14 01:04:51 -05:00
Blade He
551f754379
Fix issue when saving extracted data
2025-03-13 18:36:04 -05:00
Blade He
a48af9ddf0
A. Metrics score
...
Blade's updates
1. Set the secondary key to be the share class name, instead of the fund name
2. Exclude data points whose support is 0 when calculating the metrics
3. Add a message list to store error messages
4. Support saving metrics/error messages to an Excel file
5. Support statistics for different document lists
6. Set F1-Score as the first column in the metrics table
B. Optimize instructions for benchmark_name
2025-03-13 17:52:06 -05:00
Blade He
c2c0b33015
align fund name based on production name
...
optimize performance-related prompts
2025-03-12 21:52:00 -05:00
Blade He
6f17c2253c
optimize instructions for document 412778803
2025-03-12 17:24:39 -05:00
Blade He
765772e5a8
optimize performance_fee_costs by document 391080133
2025-03-12 14:45:48 -05:00
Blade He
c7c36dbdd2
1. update performance_fee name to performance_fee_costs
...
2. support extracting data for total_annual_dollar_based_charges
2025-03-11 17:15:39 -05:00
Blade He
b7506c78f3
Add API code file
2025-03-10 16:00:17 -05:00
Blade He
e9f6383258
apply configuration file to replace disordered table header contents
2025-03-10 11:09:00 -05:00
Blade He
4ee762963e
optimized for management_fee_and_costs and administration_fees
2025-03-08 21:40:00 -06:00
Blade He
fa2dede454
optimize for management_fee_and_costs and management_fee
2025-03-07 18:38:36 -06:00
Blade He
2cd4f5f787
Supplement provider information to ground truth data
...
Calculate metrics based on providers
Integrate "merge" data algorithm for AUS Prospectus final outputs
2025-03-07 15:02:12 -06:00
Blade He
52515fc152
1. simplify management_fee_and_costs instructions
...
2. optimize management_fee_and_costs instructions
3. resolve issues in complex scenarios that need to sum management_fee, recoverable_expenses, and indirect_costs as management_fee_and_costs
2025-03-06 17:27:18 -06:00
Blade He
c4ed65770d
Try to support more complex management_fee_and_costs scenarios
...
Support calculating metrics for all data points
2025-03-05 17:21:13 -06:00
Blade He
f4b4d00f58
optimize instructions for management fee and costs.
...
support dynamically loading complex instructions by keyword
2025-03-04 08:32:55 -06:00
Blade He
d3be711859
optimize administration fees instructions
2025-02-28 22:12:18 -06:00
Blade He
d4bc3aba4e
optimize for management fees
2025-02-28 16:55:33 -06:00
Blade He
d0295995d8
support judging whether the next page contains a table with the same structure as the current page.
...
If yes, run the data extraction pipeline on the next page.
2025-02-27 23:08:57 -06:00
Blade He
d0128d6279
1. optimize for administration fees.
...
2. optimize for management fees
2025-02-27 17:36:41 -06:00
Blade He
543cab74e1
1. get production name
...
2. if a data point is tied to the production name, set the relevant data point value(s) on each fund/share
2025-02-27 12:07:49 -06:00
Blade He
70079d176e
Support removing duplicated values, keeping only the latest ones.
2025-02-26 17:05:58 -06:00
Blade He
f467945cd4
support benchmark name data extraction
2025-02-26 10:05:46 -06:00
Blade He
357bb6d580
1. support dynamically showing fund-level data examples.
...
2. optimize for minimum_initial_investment data point
2025-02-25 10:35:53 -06:00
Blade He
75ea383354
support identifying the AUS Prospectus document category: MIS or Super
2025-02-24 15:08:15 -06:00
Blade He
bb6862b179
update a little
2025-02-19 14:32:08 -06:00
Blade He
705933bbdd
optimized for phase 2 data
2025-02-18 18:52:26 -06:00
Blade He
01e2a0e38d
add configuration for datapoints data types
...
update configuration for minimum initial investment
support applying the minimum initial investment value to all funds
2025-02-05 12:08:12 -06:00
Blade He
a8810519f8
optimize instructions configuration
...
optimize drilldown part logic
2025-02-04 15:29:24 -06:00
Blade He
47c41e492f
1. only get name mapping data from document mapping
...
2. Compare name mapping metrics between Ravi's and mine.
2025-01-27 12:29:49 -06:00
Blade He
350550d1b0
fix issue for removing item from list
2025-01-21 17:24:05 -06:00
Blade He
e2b9bcbdbc
initial abbreviation configurations
2025-01-21 17:09:45 -06:00
Blade He
b15d260a58
migrate name mapping algorithm from Ravi
2025-01-21 16:55:08 -06:00