Blade He
604ab326a7
a little change
2025-03-08 21:50:44 -06:00
Blade He
4ee762963e
optimized for management_fee_and_costs and administration_fees
2025-03-08 21:40:00 -06:00
Blade He
fa2dede454
optimize for management_fee_and_costs and management_fee
2025-03-07 18:38:36 -06:00
Blade He
2cd4f5f787
Supplement provider information to ground truth data
Calculate metrics based on providers
Integrate "merge" data algorithm for AUS Prospectus final outputs
2025-03-07 15:02:12 -06:00
Blade He
52515fc152
1. simplify management_fee_and_costs instructions
2. optimize management_fee_and_costs instructions
3. resolve issues in complex scenarios: management_fee, recoverable_expenses, and indirect_costs must be summed to give management_fee_and_costs
2025-03-06 17:27:18 -06:00
Blade He
c4ed65770d
Try to support more complex management_fee_and_costs scenarios
Support calculating metrics for all data points
2025-03-05 17:21:13 -06:00
Blade He
cd7e09757d
check in calc_metrics to repo.
2025-03-05 09:57:02 -06:00
Blade He
d00820c14d
update AUS Prospectus data point configurations
2025-03-04 16:52:06 -06:00
Blade He
f4b4d00f58
optimize instructions for management fee and costs.
support dynamically loading complex instructions by keyword
2025-03-04 08:32:55 -06:00
Blade He
d3be711859
optimize administration fees instructions
2025-02-28 22:12:18 -06:00
Blade He
d4bc3aba4e
optimize for management fees
2025-02-28 16:55:33 -06:00
Blade He
d0295995d8
support judging whether the next page contains a table with the same structure as the current page.
If yes, run the data extraction pipeline on the next page.
2025-02-27 23:08:57 -06:00
Blade He
d0128d6279
1. optimize for administration fees.
2. optimize for management fees
2025-02-27 17:36:41 -06:00
Blade He
543cab74e1
1. get production name
2. if a data point comes with a production name, set the relevant data point value(s) on each fund/share
2025-02-27 12:07:49 -06:00
Blade He
412692e1c4
update keywords for management fee and costs
2025-02-27 08:34:46 -06:00
Blade He
70079d176e
Support removing duplicated values, keeping only the latest ones.
2025-02-26 17:05:58 -06:00
Blade He
f467945cd4
support benchmark name data extraction
2025-02-26 10:05:46 -06:00
Blade He
357bb6d580
1. support dynamically showing fund-level data examples.
2. optimize for minimum_initial_investment data point
2025-02-25 10:35:53 -06:00
Blade He
e60e1fd546
move configuration files for all datapoints to "all_datapoints" folder
2025-02-24 15:23:16 -06:00
Blade He
590f7e2249
1. backup data points configurations
2. simplify data point configurations for the 11 important data points.
2025-02-24 15:21:32 -06:00
Blade He
75ea383354
support identifying the AUS Prospectus document category: MIS or Super
2025-02-24 15:08:15 -06:00
Blade He
bb6862b179
update a little
2025-02-19 14:32:08 -06:00
Blade He
705933bbdd
optimized for phase 2 data
2025-02-18 18:52:26 -06:00
Blade He
353bc28599
update a little
2025-02-11 11:49:53 -06:00
Blade He
01e2a0e38d
add configuration for datapoints data types
update configuration for minimum initial investment
support applying the minimum initial investment value to all funds
2025-02-05 12:08:12 -06:00
Blade He
a8810519f8
optimize instructions configuration
optimize drilldown part logic
2025-02-04 15:29:24 -06:00
Blade He
f9ef4cec96
update sql_query cache file store location
Cache for at most 5 days, then clean up from local disk.
2025-01-31 10:59:54 -06:00
Blade He
7f37f3532f
switch example document
2025-01-27 14:59:26 -06:00
Blade He
6f831e241c
Merge branch 'aus_prospectus_ravi'
2025-01-27 12:32:42 -06:00
Blade He
41f8c307ff
a little change
2025-01-27 12:32:36 -06:00
Blade He
47c41e492f
1. only get name mapping data from document mapping
2. Compare name mapping metrics between Ravi's and mine.
2025-01-27 12:29:49 -06:00
Blade He
d9b0bed39a
a little change
2025-01-22 09:57:42 -06:00
Blade He
350550d1b0
fix issue when removing an item from a list
2025-01-21 17:24:05 -06:00
Blade He
e2b9bcbdbc
initial abbreviation configurations
2025-01-21 17:09:45 -06:00
Blade He
b15d260a58
migrate name mapping algorithm from Ravi
2025-01-21 16:55:08 -06:00
Blade He
d41fae3dba
prepare for 100 multi-funds document samples
2025-01-17 16:26:31 -06:00
Blade He
b93a8d55e8
update for output data as template
2025-01-17 11:41:58 -06:00
Blade He
f10ff8ee33
update for deployment
2025-01-16 20:34:43 -06:00
Blade He
fb4a6402f0
support output merged data format
2025-01-16 16:31:04 -06:00
Blade He
2eace81f51
support more configurable parts
2025-01-16 13:54:45 -06:00
Blade He
db0827435b
supplement EMEA AR configuration files
2025-01-16 11:30:44 -06:00
Blade He
9f0e77a11e
support load configurations by doc_source parameter
2025-01-16 11:17:48 -06:00
Blade He
acc30d4b72
if getting text via the pdf to html API fails, fall back to getting text with pymupdf.
2025-01-15 18:36:02 -06:00
Blade He
ace0ac2674
a little change
2025-01-15 18:22:08 -06:00
Blade He
a89aa9c4de
support fetch data from Prospectus
2025-01-14 16:21:48 -06:00
Blade He
e230a5bf15
a little change
2025-01-09 12:19:24 -06:00
Blade He
91c86bb983
update AUS Prospectus relevant configuration
2025-01-08 17:40:57 -06:00
Blade He
0a867dcf07
complete configuration for AUS Prospectus
2025-01-07 16:25:13 -06:00
Blade He
201a809ffa
comment remove_abundant_data function
2025-01-06 15:27:43 -06:00
Blade He
c335992ced
update requirements.txt
2025-01-06 13:56:09 -06:00