Basic implementation of WeightedCrossEntropy torchmetric. #1

alxmrs · 2025-01-31T00:45:11Z

[WIP]

eric-czech · 2025-01-31T14:38:32Z

models/dimamba.py

@@ -835,13 +836,47 @@ def forward(
        return hidden_states, all_hidden_states


+class WeightedCrossEntropy(LanguageCrossEntropy):


I think a better approach would be to subclass torchmetrics.Metric directly, since it would then be portable across Lightning, Composer and Torchtitan (at the very least) without needing composer installed. This class also overrides pretty much everything LanguageCrossEntropy does, so I see little advantage in inheriting from it.

eric-czech · 2025-01-31T14:47:06Z

requirements.yaml

@@ -12,6 +12,7 @@ dependencies:
  - pytorch-cuda=12.1
  - pip:
      - causal-conv1d==1.1.3.post1
+      - mosaicml


Let's remove this re: https://github.com/Open-Athena/mdlm/pull/1/files#r1937418877. It's great that Composer is designed to be decoupled from models like that. Torchtitan is too AFAIK. Lightning is not, and I'm not sure yet what that means for running Lightning models like this on other training frameworks. Either way, there shouldn't be any need to depend on composer in mdlm.

Basic implementation of WeightedCrossEntropy torchmetric.

2ad9c26

[WIP]
