some problems of get_graph.py #12

Open
hbuwls opened this issue Jun 12, 2024 · 10 comments

Comments

@hbuwls

hbuwls commented Jun 12, 2024

Hello, I am a graduate student. After reading your paper, I was impressed by the quality of both the article and the code, and I would like to learn more about graph construction. However, I have run into some problems with get_graph.py. Are ./data/biomedical_data/normal_list.txt, data/biomedical_data/normal_list_BRCA.txt, and data/clinical_data/staging.txt the files that list the sample names of the dataset? Also, which path does hovernet_data_root in configs/GraphConstruction/BRCA_HovernetKimia_graph_constructor.yml refer to? I am sorry for asking so many questions, but I am eager to learn under your guidance. I hope you can spare some time to answer them. Thank you very much.

@howardchanth
Collaborator

Hi, thanks for your appreciation of our work. The list files contain the labels that we refer to when constructing the graphs, so that we can perform five-fold CV with balanced classes. hovernet_data_root refers to the pre-trained Hovernet weights, which you can download from the original Hovernet repo.
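
A minimal sketch of what a class-balanced five-fold split can look like, assuming the labels are already available as a mapping from sample ID to class. The function name `stratified_folds` and the slide IDs are hypothetical, and this is plain stratification written for illustration, not the code used in get_graph.py.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample ID to a fold so every fold has a similar class mix.

    `labels` maps a sample ID to its class label. Returns a list of
    `n_folds` lists of sample IDs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample_id, label in labels.items():
        by_class[label].append(sample_id)

    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        ids = sorted(by_class[label])
        rng.shuffle(ids)
        # Deal the shuffled IDs of this class round-robin across the folds.
        for i, sample_id in enumerate(ids):
            folds[i % n_folds].append(sample_id)
    return folds


# Hypothetical labels: 10 "normal" and 20 "tumor" slides.
labels = {f"slide_{i:02d}": ("normal" if i < 10 else "tumor") for i in range(30)}
folds = stratified_folds(labels)
for k, fold in enumerate(folds):
    n_normal = sum(labels[s] == "normal" for s in fold)
    print(f"fold {k}: {len(fold)} samples, {n_normal} normal")
```

With these made-up labels every fold receives 6 samples, 2 of them normal, so each fold keeps the overall class ratio.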

@hbuwls
Author

hbuwls commented Jun 19, 2024

Thank you very much for your guidance. I will try my best. Thank you for taking the time to reply to me.

@hbuwls
Author

hbuwls commented Jun 30, 2024

Thank you very much for your earlier help. I am sorry, but I still have some questions. Since I only know a little about the TCGA dataset, I would like to ask again: if I want to train the cancer classification task on the BRCA dataset, what should the root directory of the TCGA dataset look like? Because it is a classification task, the BRCA_trainval function is called, so I assume each line of normal_list_BRCA.txt has the format TCGA-C8-A1H1,normal. I am not sure whether this is correct. In addition, what file does type_info_path under hovernet_config in BRCA_HovernetKimianet_graph_constructor.yml point to? I understand it is a JSON file, but I do not know its specific content. Regarding your earlier answer, does hovernet_data_root in the graph constructor config refer to the path of the Hovernet weights? I am unsure because hovernet_config further down also contains Hovernet settings. Should I understand it as the root folder for training Hovernet instead? I am sorry to bother you again. I appreciate your work very much and look forward to your reply. I wish you a happy life and smooth work!

@howardchanth howardchanth reopened this Jul 6, 2024
@howardchanth
Collaborator

Hi, sorry for the late reply. The files in ./data are examples of normal_list. Specifically, each line has the format TCGA-C8-A1H1, {label_info}. You can set it to TCGA-C8-A1H1,normal, but you then need to handle this format when loading the data in data.py. Please refer to #10 and #9 for more information. It seems that type_info.json is deprecated and no longer in use, so please ignore it for now and let me know if there is any further problem. Yes, hovernet_data_root: "./data/hovernet_json" points to the directory storing the weights of HoverNet. Thanks again for using our work, and please let us know if there is any further issue.
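
A minimal sketch of reading a list file in the `<case_id>,<label_info>` format described above. The helper name `read_label_list` is hypothetical, the second case ID and the "tumor" label are made up for illustration, and the actual loading code in data.py may differ.

```python
import io


def read_label_list(handle):
    """Parse lines of the form `<case_id>,<label_info>` into a dict."""
    labels = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so label_info may itself contain commas.
        case_id, label = line.split(",", 1)
        labels[case_id.strip()] = label.strip()
    return labels


# In-memory stand-in for a file such as normal_list_BRCA.txt.
# The second case ID and its label are invented for this example.
example = io.StringIO("TCGA-C8-A1H1,normal\nTCGA-XX-0000, tumor\n")
labels = read_label_list(example)
print(labels)
```

Stripping whitespace around both fields means the parser accepts lines with or without a space after the comma.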

@hbuwls
Author

hbuwls commented Aug 3, 2024

Hello, I am sorry to ask more questions. I would like to learn more about the differences among GraphConstruction, configs, and BRCA, COAD, and ESCA. I would also like to know the difference between HovernetEfficient and HovernetKimia in GraphConstruction. As mentioned in the paper, HovernetKimia refers to the two networks Hovernet and KimiaNet. Is HovernetEfficient then just a kind of Hovernet network? I am sorry to bother you again, and I am very grateful for your earlier help. I hope your team will get better and better.

@howardchanth
Collaborator

Hi. HovernetEfficient is another of our experiments, which uses EfficientNet-B4. Since the pretrained EfficientNet was trained on natural images, we found that its performance on encoding patch features is not as good as KimiaNet's. Sorry, I don't understand your first question. Could you clarify which differences you want to learn about? Thanks.

@hbuwls
Author

hbuwls commented Aug 16, 2024

First of all, thank you very much for your reply, and I apologize for my unclear wording, which wasted your time. My question was actually simple: could you walk me through the feature_dim, radius, n_channel, and verbose parameters in configs\GraphConstruction\BRCA_HovernetKimia_graph_constructor.yml? I am sorry that I did not express myself clearly last time, and thank you for your continued advice.

@howardchanth
Collaborator

feature_dim is the dimension of each patch feature: the feature encoder compresses each patch into a vector of this dimension, and these vectors serve as the graph features. radius is the number of neighbours for KNN, which is used in the following code.

image

n_channel is the number of channels, which is 3 for colour images and 1 for black-and-white images. verbose is a boolean that determines whether debugging messages are printed.
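
A minimal sketch of how a neighbour count such as radius can drive KNN edge construction, assuming each graph node is a patch with an (x, y) centroid. The function `knn_edges`, its argument `k` (playing the role of radius), and the coordinates are hypothetical. This brute-force version is only for illustration and is not the repository's implementation.

```python
import math


def knn_edges(centroids, k):
    """Return directed edges (i, j) linking each node to its k nearest nodes."""
    edges = []
    for i, p in enumerate(centroids):
        # Euclidean distances from node i to every other node.
        dists = [(math.dist(p, q), j) for j, q in enumerate(centroids) if j != i]
        dists.sort()
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Hypothetical patch centroids (x, y) in pixel coordinates.
centroids = [(0, 0), (1, 0), (0, 2), (10, 10)]
edges = knn_edges(centroids, k=2)
print(edges)
```

Every node gets exactly `k` outgoing edges, so an isolated patch such as (10, 10) is still connected to the graph. A larger value gives a denser graph.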

@hbuwls
Author

hbuwls commented Aug 30, 2024

Good evening, and thank you for your reply. Over the past few days I tried to replicate your whole experimental pipeline on the Camelyon16 dataset, and I can now construct the graphs successfully. After reading your paper and the project, I tried running train.py and ran into a few more questions. First, I would like to ask about these settings in BRCA/HEAT4_kimia_classification_v2.yml:
train_path: "./data/BRCA_kimia_lv0/5fold_balanced/fold_4/train.txt"
eval_path: "./data/BRCA_kimia_lv0/5fold_balanced/fold_4/test.txt"
valid_path: "./data/BRCA_kimia_lv0/5fold_balanced/fold_4/val.txt"
These three paths point to text files that list the training, validation, and test sets, but if I use Camelyon16 instead, these files will not exist. In addition, after I successfully built the graphs for the first time, files were generated under checkpoints/HEAT4_BRCA_Kimia_lv0_balanced_cls_f4, but none were generated during subsequent rebuilds.
I would also like to ask whether it is feasible to reproduce the experiments on the Camelyon16 dataset. If not, how does Camelyon16 differ from other datasets such as TCGA-BRCA for reproduction purposes?
Thank you very much for your generous advice.

@howardchanth
Collaborator

Hi, sorry for the late reply. I think it is possible to reproduce the method on Camelyon16. You can refer to the functions in get_graph.py (to generate the label files for train_path, eval_path, etc.) and define your own dataset in data.py. You should then be able to obtain the heterogeneous graphs and train on them with the standard pipeline.
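
A minimal sketch of generating per-fold split files for a custom dataset, mirroring the fold_<k>/train.txt, val.txt, test.txt layout seen in the config paths quoted earlier in this thread. Several things here are assumptions that the thread does not confirm: the one `sample_id,label` line per sample file format, the choice of fold k as test and fold k+1 as validation, the helper name `write_fold_files`, and the slide IDs. Check get_graph.py for the format the training pipeline actually expects.

```python
import tempfile
from pathlib import Path


def write_fold_files(folds, labels, out_root):
    """Write train/val/test lists for each fold under out_root/fold_<k>/."""
    n = len(folds)
    for k in range(n):
        val_k = (k + 1) % n
        test_ids = folds[k]        # assumption: fold k is held out for testing
        val_ids = folds[val_k]     # assumption: the next fold is used for validation
        train_ids = [s for j, f in enumerate(folds) if j not in (k, val_k) for s in f]
        fold_dir = Path(out_root) / f"fold_{k}"
        fold_dir.mkdir(parents=True, exist_ok=True)
        for name, ids in (("train", train_ids), ("val", val_ids), ("test", test_ids)):
            # Assumed line format: "<sample_id>,<label>".
            lines = [f"{s},{labels[s]}" for s in ids]
            (fold_dir / f"{name}.txt").write_text("\n".join(lines) + "\n")


# Hypothetical slide IDs already grouped into 5 folds.
folds = [[f"normal_{k}", f"tumor_{k}"] for k in range(5)]
labels = {s: s.split("_")[0] for f in folds for s in f}

out_root = Path(tempfile.mkdtemp()) / "5fold_balanced"
write_fold_files(folds, labels, out_root)
print(sorted(p.name for p in (out_root / "fold_4").iterdir()))
```

The resulting paths can then be used for train_path, valid_path, and eval_path in a copy of the training config.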
