Blade He
|
d00820c14d
|
update AUS Prospectus data point configurations
|
2025-03-04 16:52:06 -06:00 |
Blade He
|
f4b4d00f58
|
optimize instructions for management fee and costs.
support dynamically loading complex instructions by keyword
|
2025-03-04 08:32:55 -06:00 |
Blade He
|
d3be711859
|
optimize administration fees instructions
|
2025-02-28 22:12:18 -06:00 |
Blade He
|
d4bc3aba4e
|
optimize for management fees
|
2025-02-28 16:55:33 -06:00 |
Blade He
|
d0295995d8
|
support judging whether the next page contains a table with the same structure as the current page.
If yes, run the data extraction pipeline on the next page as well.
|
2025-02-27 23:08:57 -06:00 |
Blade He
|
d0128d6279
|
1. optimize for administration fees.
2. optimize for management fees
|
2025-02-27 17:36:41 -06:00 |
Blade He
|
543cab74e1
|
1. get production name
2. if a data point comes with a production name, set the relevant data point value(s) on each fund/share
|
2025-02-27 12:07:49 -06:00 |
Blade He
|
412692e1c4
|
update keywords for management fee and costs
|
2025-02-27 08:34:46 -06:00 |
Blade He
|
70079d176e
|
Support removing duplicated values, keeping only the latest ones.
|
2025-02-26 17:05:58 -06:00 |
Blade He
|
f467945cd4
|
support benchmark name data extraction
|
2025-02-26 10:05:46 -06:00 |
Blade He
|
357bb6d580
|
1. support dynamically showing fund-level data examples.
2. optimize for minimum_initial_investment data point
|
2025-02-25 10:35:53 -06:00 |
Blade He
|
e60e1fd546
|
move configuration files for all datapoints to "all_datapoints" folder
|
2025-02-24 15:23:16 -06:00 |
Blade He
|
590f7e2249
|
1. back up data point configurations
2. simplify data point configurations for the 11 important data points.
|
2025-02-24 15:21:32 -06:00 |
Blade He
|
75ea383354
|
support identifying the AUS Prospectus document category: MIS or Super
|
2025-02-24 15:08:15 -06:00 |
Blade He
|
bb6862b179
|
update a little
|
2025-02-19 14:32:08 -06:00 |
Blade He
|
705933bbdd
|
optimized for phase 2 data
|
2025-02-18 18:52:26 -06:00 |
Blade He
|
353bc28599
|
update a little
|
2025-02-11 11:49:53 -06:00 |
Blade He
|
01e2a0e38d
|
add configuration for datapoints data types
update configuration for minimum initial investment
support applying the minimum initial investment value to all funds
|
2025-02-05 12:08:12 -06:00 |
Blade He
|
a8810519f8
|
optimize instructions configuration
optimize drilldown part logic
|
2025-02-04 15:29:24 -06:00 |
Blade He
|
f9ef4cec96
|
update sql_query cache file storage location
Cache files are kept for at most 5 days, then cleaned from local disk.
|
2025-01-31 10:59:54 -06:00 |
Blade He
|
7f37f3532f
|
switch example document
|
2025-01-27 14:59:26 -06:00 |
Blade He
|
6f831e241c
|
Merge branch 'aus_prospectus_ravi'
|
2025-01-27 12:32:42 -06:00 |
Blade He
|
41f8c307ff
|
a little change
|
2025-01-27 12:32:36 -06:00 |
Blade He
|
47c41e492f
|
1. only get name mapping data from document mapping
2. Compare name mapping metrics between Ravi's and mine.
|
2025-01-27 12:29:49 -06:00 |
Blade He
|
d9b0bed39a
|
a little change
|
2025-01-22 09:57:42 -06:00 |
Blade He
|
350550d1b0
|
fix issue for removing item from list
|
2025-01-21 17:24:05 -06:00 |
Blade He
|
e2b9bcbdbc
|
initial abbreviation configurations
|
2025-01-21 17:09:45 -06:00 |
Blade He
|
b15d260a58
|
migrate name mapping algorithm from Ravi
|
2025-01-21 16:55:08 -06:00 |
Blade He
|
d41fae3dba
|
prepare for 100 multi-funds document samples
|
2025-01-17 16:26:31 -06:00 |
Blade He
|
b93a8d55e8
|
update for output data as template
|
2025-01-17 11:41:58 -06:00 |
Blade He
|
f10ff8ee33
|
update for deployment
|
2025-01-16 20:34:43 -06:00 |
Blade He
|
fb4a6402f0
|
support output merged data format
|
2025-01-16 16:31:04 -06:00 |
Blade He
|
2eace81f51
|
support more configurable parts
|
2025-01-16 13:54:45 -06:00 |
Blade He
|
db0827435b
|
supplement EMEA AR configuration files
|
2025-01-16 11:30:44 -06:00 |
Blade He
|
9f0e77a11e
|
support loading configurations by doc_source parameter
|
2025-01-16 11:17:48 -06:00 |
Blade He
|
acc30d4b72
|
if getting text via the PDF-to-HTML API fails, fall back to getting text with pymupdf.
|
2025-01-15 18:36:02 -06:00 |
Blade He
|
ace0ac2674
|
a little change
|
2025-01-15 18:22:08 -06:00 |
Blade He
|
a89aa9c4de
|
support fetching data from Prospectus
|
2025-01-14 16:21:48 -06:00 |
Blade He
|
e230a5bf15
|
a little change
|
2025-01-09 12:19:24 -06:00 |
Blade He
|
91c86bb983
|
update AUS Prospectus relevant configuration
|
2025-01-08 17:40:57 -06:00 |
Blade He
|
0a867dcf07
|
complete configuration for AUS Prospectus
|
2025-01-07 16:25:13 -06:00 |
Blade He
|
201a809ffa
|
comment remove_abundant_data function
|
2025-01-06 15:27:43 -06:00 |
Blade He
|
c335992ced
|
update requirements.txt
|
2025-01-06 13:56:09 -06:00 |
Blade He
|
9348e32caa
|
support more performance fee keywords
|
2025-01-06 13:14:20 -06:00 |
Blade He
|
65e752e25a
|
implement merge_output_data function; whether to output in this format depends on confirmation with the data/developer teams
|
2024-12-18 09:19:55 -06:00 |
Blade He
|
309bb714f6
|
fix issue for parsing data via Vision Function.
|
2024-12-11 16:49:04 -06:00 |
Blade He
|
d673a99e21
|
switch back to extracting data from the image stream directly, instead of first getting text from the image stream and then extracting data from that text.
Reason: the quality of the text obtained from the image stream is not good enough.
|
2024-12-10 16:17:47 -06:00 |
Blade He
|
f71e2968cc
|
simplify code
|
2024-12-09 22:24:40 -06:00 |
Blade He
|
75ea5e70de
|
1. support fetching data from garbled-text pages via the ChatGPT4o Vision function.
2. multilingual share features configuration
|
2024-12-09 17:47:42 -06:00 |
Blade He
|
d96f77fe00
|
Split share class names when multiple share classes appear on the same line
|
2024-12-06 16:31:42 -06:00 |