## Vulnerability History

| Date | High Risk | Low Risk |
|---|---|---|
| 2024-11-11 | 2 | 1 |

## Audit Report Details

- Lines of code: 18,500
- Open findings: 5
- Resolved findings: 0

## Vulnerable Code
---
File: /contrib/CODE_REVIEW_DOCS.md
---

# Code Review
### Conceptual Review

A review can be a conceptual review, where the reviewer leaves a comment:
 * `Concept (N)ACK`, meaning "I do (not) agree with the general goal of this pull
   request",
 * `Approach (N)ACK`, meaning `Concept ACK`, but "I do (not) agree with the
   approach of this change".

A `NACK` needs to include a rationale for why the change is not worthwhile.
NACKs without accompanying reasoning may be disregarded.
After conceptual agreement on the change, code review can be provided. A review
begins with `ACK BRANCH_COMMIT`, where `BRANCH_COMMIT` is the top of the PR
branch, followed by a description of how the reviewer did the review. The
following language is used within pull request comments:

  - "I have tested the code", involving change-specific manual testing in
    addition to running the unit, functional, or fuzz tests, and in case it is
    not obvious how the manual testing was done, it should be described;
  - "I have not tested the code, but I have reviewed it and it looks
    OK, I agree it can be merged";
  - A "nit" refers to a trivial, often non-blocking issue.

### Code Review
Project maintainers reserve the right to weigh the opinions of peer reviewers
using common sense judgement and may also weigh based on merit. Reviewers who
have demonstrated a deeper commitment and understanding of the project over time
or who have clear domain expertise may naturally have more weight, as one would
expect in all walks of life.

Where a patch set affects consensus-critical code, the bar will be much
higher in terms of discussion and peer review requirements, keeping in mind that
mistakes could be very costly to the wider community. This includes refactoring
of consensus-critical code.

Where a patch set proposes to change the Bittensor consensus, it must have been
discussed extensively on the Discord server and other channels, be accompanied by a widely
discussed BIP, and have a generally widely perceived technical consensus of being
a worthwhile change based on the judgement of the maintainers.

### Finding Reviewers

As most reviewers are themselves developers with their own projects, the review
process can be quite lengthy, and some amount of patience is required. If you find
that you've been waiting for a pull request to be given attention for several
months, there may be a number of reasons for this, some of which you can do something
about:

  - It may be because of a feature freeze due to an upcoming release. During this time,
    only bug fixes are taken into consideration. If your pull request is a new feature,
    it will not be prioritized until after the release. Wait for the release.
  - It may be because the changes you are suggesting do not appeal to people. Nits and
    critique require effort and mean that people care enough to spend time on your
    contribution; thundering silence, on the other hand, is a good sign of widespread
    (mild) dislike of a given change (because people assume that *others* won't
    actually like the proposal either). Don't take that personally, though!
    Instead, take another critical look at what you are suggesting
    and see if it: changes too much, is too broad, doesn't adhere to the
    [developer notes](DEVELOPMENT_WORKFLOW.md), is dangerous or insecure, is messily written, etc.
    Identify and address any of the issues you find. Then ask e.g. on Discord if someone could give
    their opinion on the concept itself.
  - It may be because your code is too complex for all but a few people, and those people
    may not have realized your pull request even exists. A great way to find people who
    are qualified and care about the code you are touching is the
    [Git Blame feature](https://docs.github.com/en/github/managing-files-in-a-repository/managing-files-on-github/tracking-changes-in-a-file). Simply
    look up who last modified the code you are changing and see if you can find
    them and give them a nudge. Don't be incessant about the nudging, though.
  - Finally, if all else fails, ask on Discord or elsewhere for someone to give your pull request
    a look. If you think you've been waiting for an unreasonably long time (say,
    more than a month) for no particular reason (a few lines changed, etc.),
    asking is totally fine. Try to return the favor when someone else is asking
    for feedback on their code, and the universe balances out.
  - Remember that the best thing you can do while waiting is give review to others!



---
File: /contrib/CONTRIBUTING.md
---

# Contributing to Bittensor Subnet Development

The following is a set of guidelines for contributing to the Bittensor ecosystem. These are **HIGHLY RECOMMENDED** guidelines, but not hard-and-fast rules. Use your best judgment, and feel free to propose changes to this document in a pull request.

## Table Of Contents
1. [How Can I Contribute?](#how-can-i-contribute)
   1. [Communication Channels](#communication-channels)
   1. [Code Contribution General Guidelines](#code-contribution-general-guidelines)
   1. [Pull Request Philosophy](#pull-request-philosophy)
   1. [Pull Request Process](#pull-request-process)
   1. [Addressing Feedback](#addressing-feedback)
   1. [Squashing Commits](#squashing-commits)
   1. [Refactoring](#refactoring)
   1. [Peer Review](#peer-review)
2. [Suggesting Features](#suggesting-enhancements-and-features)

## How Can I Contribute?
TODO(developer): Define your desired contribution procedure.

## Communication Channels
TODO(developer): Place your communication channels here

> Please follow the Bittensor Subnet [style guide](./STYLE.md) regardless of your contribution type.

Here is a high-level summary:
- Code consistency is crucial; adhere to established programming language conventions.
- Use `black` to format your Python code; it ensures readability and consistency.
- Write concise Git commit messages; summarize changes in ~50 characters.
- Follow these six commit rules:
  - Atomic Commits: Focus on one task or fix per commit.
  - Subject and Body Separation: Use a blank line to separate the subject from the body.
  - Subject Line Length: Keep it under 50 characters for readability.
  - Imperative Mood: Write the subject line as if giving a command or instruction.
  - Body Text Width: Wrap text manually at 72 characters.
  - Body Content: Explain what changed and why, not how.
- Make use of your commit messages to simplify project understanding and maintenance.

> For clear examples of each of the commit rules, see the style guide's [rules](./STYLE.md#the-six-rules-of-a-great-commit) section.
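
As an illustration (a hypothetical commit made in a throwaway repository; the file name and message below are invented, not part of this project), the six rules might look like this in practice:

```shell
set -e
# Create a throwaway repository purely to demonstrate the commit format.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Demo Dev"

echo "timeout = 12" > config.txt
git add config.txt

# Subject: imperative mood, under 50 characters.
# Body: passed as a second -m flag, so Git separates it from the
# subject with a blank line; wrapped at 72 characters; explains
# what changed and why, not how.
git commit -q \
  -m "Add per-request timeout to miner queries" \
  -m "Queries to unresponsive miners previously blocked the scoring loop
indefinitely. A 12-second timeout keeps validation moving when a miner
hangs."

git log -1 --format=%s   # prints the subject line
```

Each `-m` flag becomes its own paragraph of the commit message, which gives you the subject/body separation without opening an editor.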

### Code Contribution General Guidelines

> Review the Bittensor Subnet [style guide](./STYLE.md) and [development workflow](./DEVELOPMENT_WORKFLOW.md) before contributing.


#### Pull Request Philosophy

Patch sets and enhancements should always be focused. A pull request could add a feature, fix a bug, or refactor code, but it should not contain a mixture of these. Please also avoid 'super' pull requests which attempt to do too much, are overly large, or overly complex, as this makes review difficult.

Specifically, pull requests must adhere to the following criteria:
- Contain fewer than 50 files; PRs with 50 or more files will be closed.
- If a PR introduces a new feature, it *must* include corresponding tests.
- Other PRs (bug fixes, refactoring, etc.) should ideally also have tests, as they provide proof of concept and prevent regressions.
- Categorize your PR properly by using GitHub labels. This aids in the review process by informing reviewers about the type of change at a glance.
- Make sure your code includes adequate comments. These should explain why certain decisions were made and how your changes work.
- If your changes are extensive, consider breaking your PR into smaller, related PRs. This makes your contributions easier to understand and review.
- Be active in the discussion about your PR. Respond promptly to comments and questions to help reviewers understand your changes and speed up the acceptance process.

Generally, all pull requests must:

  - Have a clear use case, fix a demonstrable bug, or serve the greater good of the project (e.g. refactoring for modularisation).
  - Be well peer-reviewed.
  - Follow code style guidelines.
  - Not break the existing test suite.
  - Where bugs are fixed, where possible, include unit tests demonstrating the bug and proving the fix.
  - Update relevant comments and documentation when the behaviour of code changes.

#### Pull Request Process

Please follow these steps to have your contribution considered by the maintainers:

*Before* creating the PR:
1. Read the [development workflow](./DEVELOPMENT_WORKFLOW.md) defined for this repository to understand our workflow.
2. Ensure your PR meets the criteria stated in the 'Pull Request Philosophy' section.
3. Include relevant tests for any fixed bugs or new features as stated in the [testing guide](./TESTING.md).
4. Ensure your commit messages are clear and concise. Include the issue number if applicable.
5. If you have multiple commits, rebase them into a single commit using `git rebase -i`.
6. In the PR description, explain what your changes do and why you think they should be merged, consistent with the [style guide](./STYLE.md).

*After* creating the PR:
1. Verify that all [status checks](https://help.github.com/articles/about-status-checks/) are passing after you submit your pull request.
2. Label your PR using GitHub's labeling feature. The labels help categorize the PR and streamline the review process.
3. Document your code with comments that provide a clear understanding of your changes. Explain any non-obvious parts of your code or design decisions you've made.
4. If your PR has extensive changes, consider splitting it into smaller, related PRs. This reduces the cognitive load on the reviewers and speeds up the review process.

Please be responsive and participate in the discussion on your PR! This aids in clarifying any confusion or concerns and leads to quicker resolution and merging of your PR.

> Note: If your changes are not ready for merge but you want feedback, create a draft pull request.

Following these criteria will aid in quicker review and potential merging of your PR.
While the prerequisites above must be satisfied prior to having your pull request reviewed, the reviewer(s) may ask you to complete additional design work, tests, or other changes before your pull request can be ultimately accepted.

When you are ready to submit your changes, create a pull request:

> **Always** follow the [style guide](./STYLE.md) and [development workflow](./DEVELOPMENT_WORKFLOW.md) before submitting pull requests.

After you submit a pull request, it will be reviewed by the maintainers. They may ask you to make changes. Please respond to any comments and push your changes as a new commit.

> Note: Be sure to merge the latest from "upstream" before making a pull request:

```bash
git remote add upstream https://github.com/opentensor/bittensor.git # TODO(developer): replace with your repo URL
git fetch upstream
git merge upstream/main # or your repository's default branch
git push origin <your-branch-name>
```

#### Addressing Feedback

After submitting your pull request, expect comments and reviews from other contributors. You can add more commits to your pull request by committing them locally and pushing to your fork.

You are expected to reply to any review comments before your pull request is merged. You may update the code or reject the feedback if you do not agree with it, but you should express so in a reply. If there is outstanding feedback and you are not actively working on it, your pull request may be closed.

#### Squashing Commits

If your pull request contains fixup commits (commits that change the same line of code repeatedly) or too fine-grained commits, you may be asked to [squash](https://git-scm.com/docs/git-rebase#_interactive_mode) your commits before it will be reviewed. The basic squashing workflow is shown below.

    git checkout your_branch_name
    git rebase -i HEAD~n
    # n is normally the number of commits in the pull request.
    # Set commits (except the one in the first line) from 'pick' to 'squash', save and quit.
    # On the next screen, edit/refine commit messages.
    # Save and quit.
    git push -f # (force push to GitHub)

Please update the resulting commit message, if needed. It should read as a coherent message. In most cases, this means not just listing the interim commits.

If your change contains a merge commit, the above workflow may not work and you will need to remove the merge commit first. See the next section for details on how to rebase.

Please refrain from creating several pull requests for the same change. Use the pull request that is already open (or was created earlier) to amend changes. This preserves the discussion and review that happened earlier for the respective change set.

The length of time required for peer review is unpredictable and will vary from pull request to pull request.

#### Refactoring

Refactoring is a necessary part of any software project's evolution. The following guidelines cover refactoring pull requests for the project.

There are three categories of refactoring: code-only moves, code style fixes, and code refactoring. In general, refactoring pull requests should not mix these three kinds of activities, in order to make refactoring pull requests easy to review and uncontroversial. In all cases, refactoring PRs must not change the behaviour of code within the pull request (bugs must be preserved as is).

Project maintainers aim for a quick turnaround on refactoring pull requests, so where possible keep them short, simple, and easy to verify.

Pull requests that refactor the code should not be made by new contributors. It requires a certain level of experience to know where the code belongs and to understand the full ramifications (including the rebase effort for open pull requests). Trivial pull requests, or pull requests that refactor the code with no clear benefits, may be immediately closed by the maintainers to reduce unnecessary review workload.

#### Peer Review

Anyone may participate in peer review, which is expressed by comments in the pull request. Typically reviewers will review the code for obvious errors, as well as test out the patch set and opine on the technical merits of the patch. Project maintainers take the peer review into account when determining if there is consensus to merge a pull request (remember that discussions may have taken place elsewhere, not just on GitHub). The following language is used within pull-request comments:

- ACK means "I have tested the code and I agree it should be merged";
- NACK means "I disagree this should be merged", and must be accompanied by sound technical justification. NACKs without accompanying reasoning may be disregarded;
- utACK means "I have not tested the code, but I have reviewed it and it looks OK, I agree it can be merged";
- Concept ACK means "I agree with the general principle of this pull request";
- Nit refers to a trivial, often non-blocking issue.

Reviewers should include the commit(s) they have reviewed in their comments. This can be done by copying the commit SHA-1 hash.
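
For example (a sketch run in a throwaway repository; file name and message are invented), the full hash of the branch tip can be copied with `git rev-parse`:

```shell
set -e
# Throwaway repository purely to demonstrate grabbing a commit hash.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Demo Dev"
echo "x" > file.txt
git add file.txt
git commit -q -m "Add example file"

# The full 40-character SHA-1 of the current branch tip, suitable
# for pasting into an "ACK <commit-hash>" review comment:
git rev-parse HEAD
```

The reviewer then comments `ACK <commit-hash>` on the pull request, pinning the review to that exact revision even if the branch is later updated.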

A pull request that changes consensus-critical code is considerably more involved than a pull request that adds a feature to the wallet, for example. Such patches must be reviewed and thoroughly tested by several reviewers who are knowledgeable about the changed subsystems. Where new features are proposed, it is helpful for reviewers to try out the patch set on a test network and indicate that they have done so in their review. Project maintainers will take this into consideration when merging changes.

For a more detailed description of the review process, see the [Code Review Guidelines](CODE_REVIEW_DOCS.md).

> **Note:** If you find a **Closed** issue that seems like it is the same thing that you're experiencing, open a new issue and include a link to the original issue in the body of your new one.

#### How Do I Submit A (Good) Bug Report?

Please track bugs as GitHub issues.

Explain the problem and include additional details to help maintainers reproduce the problem:

* **Use a clear and descriptive title** for the issue to identify the problem.
* **Describe the exact steps which reproduce the problem** in as much detail as possible. For example, start by explaining how you started the application, e.g. which command exactly you used in the terminal, or how you started Bittensor otherwise. When listing steps, **don't just say what you did, but explain how you did it**. For example, if you ran with a set of custom configs, explain whether you used a config file or command line arguments.
* **Provide specific examples to demonstrate the steps**. Include links to files or GitHub projects, or copy/pasteable snippets, which you use in those examples. If you're providing snippets in the issue, use [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the behavior you observed after following the steps** and point out what exactly is the problem with that behavior.
* **Explain which behavior you expected to see instead and why.**
* **Include screenshots and animated GIFs** which show you following the described steps and clearly demonstrate the problem. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **If you're reporting that Bittensor crashed**, include a crash report with a stack trace from the operating system. On macOS, the crash report will be available in `Console.app` under "Diagnostic and usage information" > "User diagnostic reports". Include the crash report in the issue in a [code block](https://help.github.com/articles/markdown-basics/#multiple-lines), a [file attachment](https://help.github.com/articles/file-attachments-on-issues-and-pull-requests/), or put it in a [gist](https://gist.github.com/) and provide a link to that gist.
* **If the problem is related to performance or memory**, include a CPU profile capture with your report; if you're using a GPU, include a GPU profile capture as well. Look into the [PyTorch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) to examine the memory usage of your model.
* **If the problem wasn't triggered by a specific action**, describe what you were doing before the problem happened and share more information using the guidelines below.

Provide more context by answering these questions:

* **Did the problem start happening recently** (e.g. after updating to a new version) or was this always a problem?
* If the problem started happening recently, **can you reproduce the problem in an older version of Bittensor?**
* **Can you reliably reproduce the issue?** If not, provide details about how often the problem happens and under which conditions it normally happens.

Include details about your configuration and environment:

* **Which version of Bittensor Subnet are you using?**
* **What commit hash are you on?** You can get the exact commit hash by checking `git log` and pasting the full commit hash.
* **What's the name and version of the OS you're using?**
* **Are you running Bittensor Subnet in a virtual machine?** If so, which VM software are you using and which operating systems and versions are used for the host and the guest?
* **Are you running Bittensor Subnet in a dockerized container?** If so, have you made sure that your Docker container contains your latest changes and is up to date with the main branch?

### Suggesting Enhancements and Features

This section guides you through submitting an enhancement suggestion, including completely new features and minor improvements to existing functionality. Following these guidelines helps maintainers and the community understand your suggestion :pencil: and find related suggestions :mag_right:.

When you are creating an enhancement suggestion, please [include as many details as possible](#how-do-i-submit-a-good-feature-suggestion). Fill in [the template](https://bit.ly/atom-behavior-pr), including the steps that you imagine you would take if the feature you're requesting existed.

#### Before Submitting An Enhancement Suggestion

* **Check the [debugging guide](./DEBUGGING.md)** for tips; you might discover that the enhancement is already available. Most importantly, check if you're using the latest version of the project first.

#### How Do I Submit A (Good) Feature Suggestion?

* **Use a clear and descriptive title** for the issue to identify the problem.
* **Provide a step-by-step description of the suggested enhancement** in as much detail as possible.
* **Provide specific examples to demonstrate the steps**. Include copy/pasteable snippets which you use in those examples, as [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the current behavior** and **explain which behavior you expected to see instead** and why.
* **Include screenshots and animated GIFs** which help you demonstrate the steps or point out the part of the project which the suggestion is related to. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **Explain why this enhancement would be useful** to most users.
* **List some other projects or applications where this enhancement exists.**
* **Specify the name and version of the OS you're using.**

Thank you for considering contributing to Bittensor! Any help is greatly appreciated along this journey to incentivize open and permissionless intelligence.


---
File: /contrib/DEVELOPMENT_WORKFLOW.md
---

# Bittensor Subnet Development Workflow

This is a highly advisable workflow to follow to keep your subnet project organized and foster ease of contribution.

## Table of contents

- [Bittensor Subnet Development Workflow](#bittensor-subnet-development-workflow)
  - [Main Branches](#main-branches)
  - [Development Model](#development-model)
      - [Feature Branches](#feature-branches)
      - [Release Branches](#release-branches)
      - [Hotfix Branches](#hotfix-branches)
  - [Git Operations](#git-operations)
      - [Creating a Feature Branch](#creating-a-feature-branch)
      - [Merging Feature Branch into Staging](#merging-feature-branch-into-staging)
      - [Creating a Release Branch](#creating-a-release-branch)
      - [Finishing a Release Branch](#finishing-a-release-branch)
      - [Creating a Hotfix Branch](#creating-a-hotfix-branch)
      - [Finishing a Hotfix Branch](#finishing-a-hotfix-branch)
  - [Continuous Integration (CI) and Continuous Deployment (CD)](#continuous-integration-ci-and-continuous-deployment-cd)
  - [Versioning and Release Notes](#versioning-and-release-notes)
  - [Pending Tasks](#pending-tasks)

## Main Branches

Bittensor's codebase consists of two main branches: **main** and **staging**.

**main**
- This is Bittensor's live production branch, which should only be updated by the core development team. This branch is protected, so refrain from pushing or merging into it unless authorized.

**staging**
- This branch is continuously updated and is where you propose and merge changes. It's essentially Bittensor's active development branch.

## Development Model

### Feature Branches

- Branch off from: `staging`
- Merge back into: `staging`
- Naming convention: `feature/<ticket>/<descriptive-sentence>`

Feature branches are used to develop new features for upcoming or future releases. They exist as long as the feature is in development, but will eventually be merged into `staging` or discarded. Always delete your feature branch after merging to avoid unnecessary clutter.

### Release Branches

- Branch off from: `staging`
- Merge back into: `staging` and then `main`
- Naming convention: `release/<version>/<descriptive-message>/<creator's-name>`

Release branches support the preparation of a new production release, allowing for minor bug fixes and preparation of metadata (version number, configuration, etc.). All new features should be merged into `staging` to wait for the next big release.

### Hotfix Branches

General workflow:

- Branch off from: `main` or `staging`
- Merge back into: `staging` then `main`
- Naming convention: `hotfix/<version>/<descriptive-message>/<creator's-name>`

Hotfix branches are meant for quick fixes in the production environment. When a critical bug in a production version must be resolved immediately, a hotfix branch is created.

## Git Operations

#### Creating a Feature Branch

1. Branch from the **staging** branch.
    1. Command: `git checkout -b feature/my-feature staging`

> Rebase frequently with the updated staging branch so you do not face big conflicts before submitting your pull request. Remember, syncing your changes with other developers could also help you avoid big conflicts.
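
A minimal sketch of that rebase loop, run in a throwaway repository (branch names follow this guide; the files and messages are invented, and `git init -b` assumes Git 2.28 or later):

```shell
set -e
# Throwaway repository demonstrating the fetch-and-rebase loop.
demo=$(mktemp -d)
cd "$demo"
git init -q -b staging
git config user.email "dev@example.com"
git config user.name "Demo Dev"
echo "base" > app.txt
git add app.txt && git commit -q -m "Initial commit"

# Start a feature branch off staging.
git checkout -q -b feature/my-feature staging
echo "feature work" > feature.txt
git add feature.txt && git commit -q -m "Add feature file"

# Meanwhile, staging moves ahead with someone else's work.
git checkout -q staging
echo "other work" > other.txt
git add other.txt && git commit -q -m "Add other file"

# Back on the feature branch, replay our commits on the new staging tip.
git checkout -q feature/my-feature
git rebase -q staging
git log --format=%s   # feature commit now sits on top of staging's tip
```

When you work against a shared remote, the rebase step becomes `git fetch origin && git rebase origin/staging`; resolve any conflicts, `git add` the fixed files, then run `git rebase --continue`.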

#### Merging Feature Branch into Staging

In other words, integrate your changes into a branch that will be tested and prepared for release.

1. Switch to the staging branch: `git checkout staging`
2. Merge the feature branch into staging: `git merge --no-ff feature/my-feature`
3. Push the changes to staging: `git push origin staging`
4. Delete the feature branch: `git branch -d feature/my-feature` (alternatively, this can be done on the GitHub web UI)

This operation is done by GitHub when merging a PR.

So, what you have to keep in mind is:
- Open the PR against the `staging` branch.
- After merging a PR you should delete your feature branch. This will be strictly enforced.

#### Creating a Release Branch

1. Create a branch from staging: `git checkout -b release/3.4.0/descriptive-message/creator's-name staging`
2. Update the version with major or minor: `./scripts/update_version.sh major|minor`
3. Commit the file changes with the new version: `git commit -a -m "Updated version to 3.4.0"`


#### Finishing a Release Branch

This involves releasing stable code and generating a new version for bittensor.

1. Switch to the main branch: `git checkout main`
2. Merge the release branch into main: `git merge --no-ff release/3.4.0/descriptive-message/creator's-name`
3. Tag the changeset: `git tag -a v3.4.0 -m "Releasing v3.4.0: some comment about it"`
4. Push changes to main: `git push origin main`
5. Push tags to origin: `git push origin --tags`

To keep the changes made in the __release__ branch, we need to merge those back into `staging`:

- Switch to the staging branch: `git checkout staging`
- Merge the release branch into staging: `git merge --no-ff release/3.4.0/descriptive-message/creator's-name`

This step may well lead to a merge conflict (quite likely, since we have changed the version number). If so, fix it and commit.


#### Creating a Hotfix Branch

1. Create a branch from main: `git checkout -b hotfix/3.3.4/descriptive-message/creator's-name main`
2. Update the patch version: `./scripts/update_version.sh patch`
3. Commit the file changes with the new version: `git commit -a -m "Updated version to 3.3.4"`
4. Fix the bug and commit the fix: `git commit -m "Fixed critical production issue X"`

#### Finishing a Hotfix Branch

Finishing a hotfix branch involves merging the bugfix into both `main` and `staging`.

1. Switch to the main branch: `git checkout main`
2. Merge the hotfix into main: `git merge --no-ff hotfix/3.3.4/descriptive-message/creator's-name`
3. Tag the new version: `git tag -a v3.3.4 -m "Releasing v3.3.4: descriptive comment about the hotfix"`
4. Push changes to main: `git push origin main`
5. Push tags to origin: `git push origin --tags`
6. Switch to the staging branch: `git checkout staging`
7. Merge the hotfix into staging: `git merge --no-ff hotfix/3.3.4/descriptive-message/creator's-name`
8. Push changes to origin/staging: `git push origin staging`
9. Delete the hotfix branch: `git branch -d hotfix/3.3.4/descriptive-message/creator's-name`

The one exception to the rule here is that, **when a release branch currently exists, the hotfix changes need to be merged into that release branch, instead of** `staging`. Back-merging the bugfix into the __release__ branch will eventually result in the bugfix being merged into `staging` too, when the release branch is finished. (If work in `staging` immediately requires this bugfix and cannot wait for the release branch to be finished, you may safely merge the bugfix into `staging` now as well.)

## Continuous Integration (CI) and Continuous Deployment (CD)

Continuous Integration (CI) is a software development practice where members of a team integrate their work frequently. Each integration is verified by an automated build and test process to detect integration errors as quickly as possible.

Continuous Deployment (CD) is a software engineering approach in which software functionalities are delivered frequently through automated deployments.

- **CircleCI job**: Create jobs in CircleCI to automate the merging of staging into main and the release of a version (needed to release code), and the building and testing of Bittensor (needed to merge PRs).

> It is highly recommended to set up your own CircleCI pipeline for your subnet.

## Versioning and Release Notes

Semantic versioning helps keep track of the different versions of the software. When code is merged into main, generate a new version.

Release notes provide documentation for each version released to the users, highlighting the new features, improvements, and bug fixes. When merged into main, generate a GitHub release and release notes.
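
One lightweight way to draft those notes (a sketch in a throwaway repository; the tag names and commit messages below are illustrative) is to list the commit subjects between the previous and the new version tags:

```shell
set -e
# Throwaway repository with two tagged versions, purely for illustration.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Demo Dev"
echo "v1" > app.txt
git add app.txt && git commit -q -m "Initial release"
git tag -a v3.3.4 -m "Releasing v3.3.4"
echo "v2" > app.txt
git commit -q -a -m "Add new validator feature"
git tag -a v3.4.0 -m "Releasing v3.4.0"

# Commit subjects since the last release: a starting point for the notes.
git log v3.3.4..v3.4.0 --format='- %s'
```

The bulleted output can be pasted into the GitHub release description and then grouped by hand into features, improvements, and bug fixes.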

## Pending Tasks

Follow these steps when you are contributing to the Bittensor subnet:

- Determine if main and staging are different.
- Determine what is in staging that is not merged yet:
    - Document unreleased developments.
    - When merged into staging, generate information about what's merged into staging but not released.
    - When merged into main, generate a GitHub release and release notes.
- CircleCI jobs:
    - Merge staging into main and release a version (needed to release code).
    - Build and test Bittensor (needed to merge PRs).

This document can be improved as the Bittensor project continues to develop and change.

---
File: /contrib/STYLE.md
---

# Style Guide

A project’s long-term success rests (among other things) on its maintainability, and a maintainer has few tools more powerful than his or her project’s log. It’s worth taking the time to learn how to care for one properly. What may be a hassle at first soon becomes habit, and eventually a source of pride and productivity for all involved.

Most programming languages have well-established conventions as to what constitutes idiomatic style, i.e. naming, formatting and so on. There are variations on these conventions, of course, but most developers agree that picking one and sticking to it is far better than the chaos that ensues when everybody does their own thing.

# Table of Contents
1. [Code Style](#code-style)
2. [Naming Conventions](#naming-conventions)
3. [Git Commit Style](#git-commit-style)
4. [The Six Rules of a Great Commit](#the-six-rules-of-a-great-commit)
   - [1. Atomic Commits](#1-atomic-commits)
   - [2. Separate Subject from Body with a Blank Line](#2-separate-subject-from-body-with-a-blank-line)
   - [3. Limit the Subject Line to 50 Characters](#3-limit-the-subject-line-to-50-characters)
   - [4. Use the Imperative Mood in the Subject Line](#4-use-the-imperative-mood-in-the-subject-line)
   - [5. Wrap the Body at 72 Characters](#5-wrap-the-body-at-72-characters)
   - [6. Use the Body to Explain What and Why vs. How](#6-use-the-body-to-explain-what-and-why-vs-how)
5. [Tools Worth Mentioning](#tools-worth-mentioning)
   - [Using `--fixup`](#using---fixup)
   - [Interactive Rebase](#interactive-rebase)
6. [Pull Request and Squashing Commits Caveats](#pull-request-and-squashing-commits-caveats)

### Code style

#### General Style
Python's official style guide is PEP 8, which provides conventions for writing code for the main Python distribution. Here are some key points:

- `Indentation:` Use 4 spaces per indentation level.

- `Line Length:` Limit all lines to a maximum of 79 characters.

- `Blank Lines:` Surround top-level function and class definitions with two blank lines. Method definitions inside a class are surrounded by a single blank line.

- `Imports:` Imports should usually be on separate lines and should be grouped in the following order:

    - Standard library imports.
    - Related third party imports.
    - Local application/library specific imports.

- `Whitespace:` Avoid extraneous whitespace in the following situations:

    - Immediately inside parentheses, brackets or braces.
    - Immediately before a comma, semicolon, or colon.
    - Immediately before the open parenthesis that starts the argument list of a function call.

- `Comments:` Comments should be complete sentences, used to clarify code; they are not a substitute for clearly written code.

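The points above can be sketched in a short, self-contained module. This is illustrative only; the function and class names are made up for the example:

```python
# Standard library imports come first, each on its own line; third-party
# and local imports would follow in their own groups, separated by blank lines.
import os
import sys


def format_path(base_dir, name):
    """Join a base directory and a name into a normalized path."""
    # No extraneous whitespace inside parentheses or before commas.
    return os.path.normpath(os.path.join(base_dir, name))


class PathHelper:
    """Top-level definitions are separated by two blank lines."""

    def home(self):
        # Methods inside a class are separated by a single blank line.
        return os.path.expanduser("~")
```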
#### For Python

- `List Comprehensions:` Use list comprehensions for concise and readable creation of lists.

- `Generators:` Use generators when dealing with large amounts of data to save memory.

- `Context Managers:` Use context managers (`with` statement) for resource management.

- `String Formatting:` Use f-strings for formatting strings in Python 3.6 and above.

- `Error Handling:` Use exceptions for error handling whenever possible.

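A compact sketch of these idioms side by side (the names and values are illustrative):

```python
from contextlib import contextmanager

# List comprehension: concise, readable list creation.
squares = [n * n for n in range(5)]

# Generator expression: values are produced lazily, saving memory
# when dealing with large amounts of data.
total = sum(n * n for n in range(1_000))

# f-string formatting (Python 3.6+).
message = f"first square: {squares[1]}"


# Context manager for resource management (a toy stand-in for a file
# or network handle).
@contextmanager
def managed(resource):
    try:
        yield resource
    finally:
        resource.clear()  # cleanup always runs


# Exceptions for error handling.
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None
```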
#### More details

Use `black` to format your Python code before committing, for consistency across such a large pool of contributors. Black's code [style](https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#code-style) ensures consistent and opinionated code formatting. It automatically formats your Python code according to the Black style guide, enhancing code readability and maintainability.

Key features of Black:

- `Consistency:` Black enforces a single, consistent coding style across your project, eliminating style debates and allowing developers to focus on code logic.

- `Readability:` By applying a standard formatting style, Black improves code readability, making it easier to understand and collaborate on projects.

- `Automation:` Black automates the code formatting process, saving time and effort. It eliminates the need for manual formatting and reduces the likelihood of inconsistencies.
### Naming Conventions

- `Classes:` Class names should normally use the CapWords convention.

- `Functions and Variables:` Function names should be lowercase, with words separated by underscores as necessary to improve readability. Variable names follow the same convention as function names.

- `Constants:` Constants are usually defined on a module level and written in all capital letters with underscores separating words.

- `Non-public Methods and Instance Variables:` Use a single leading underscore (_). This is a weak "internal use" indicator.

- `Strongly "private" methods and variables:` Use a double leading underscore (__). This triggers name mangling in Python.

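All five conventions in one place (the class and names are invented for illustration):

```python
MAX_RETRIES = 3  # constant: ALL_CAPS with underscores, module level


class StreamHandler:  # class name: CapWords
    def __init__(self):
        self._buffer = []      # single leading underscore: "internal use"
        self.__secret = "key"  # double leading underscore: name-mangled

    def append_chunk(self, chunk):  # function name: lowercase_with_underscores
        self._buffer.append(chunk)


h = StreamHandler()
h.append_chunk("hello")
# Name mangling stores __secret as _StreamHandler__secret, so it is not
# reachable as h.__secret from outside the class.
```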
### Git commit style

Here’s a model Git commit message when contributing:
```
Summarize changes in around 50 characters or less

More detailed explanatory text, if necessary. Wrap it to about 72
characters or so. In some contexts, the first line is treated as the
subject of the commit and the rest of the text as the body. The
blank line separating the summary from the body is critical (unless
you omit the body entirely); various tools like `log`, `shortlog`
and `rebase` can get confused if you run the two together.

Explain the problem that this commit is solving. Focus on why you
are making this change as opposed to how (the code explains that).
Are there side effects or other unintuitive consequences of this
change? Here's the place to explain them.

Further paragraphs come after blank lines.

 - Bullet points are okay, too

 - Typically a hyphen or asterisk is used for the bullet, preceded
   by a single space, with blank lines in between, but conventions
   vary here

If you use an issue tracker, put references to them at the bottom,
like this:

Resolves: #123
See also: #456, #789
```

## The six rules of a great commit

#### 1. Atomic Commits
An “atomic” change revolves around one task or one fix.

Atomic Approach
 - Commit each fix or task as a separate change
 - Only commit when a block of work is complete
 - Commit each layout change separately
 - Joint commit for layout file, code-behind file, and additional resources

Benefits

- Easy to roll back without affecting other changes
- Easy to make other changes on the fly
- Easy to merge features to other branches

#### Avoid trivial commit messages

Commit messages like "fix", "fix2", or "fix3" don't provide any context or clear understanding of what changes the commit introduces. Here are some examples of good vs. bad commit messages:

**Bad Commit Message:**

    $ git commit -m "fix"

**Good Commit Message:**

    $ git commit -m "Fix typo in README file"

> **Caveat**: When working with new features, an atomic commit will often consist of multiple files, since a layout file, code-behind file, and additional resources may have been added/modified. You don’t want to commit all of these separately, because if you had to roll back the application to a state before the feature was added, it would involve multiple commit entries, and that can get confusing.

#### 2. Separate subject from body with a blank line

Not every commit requires both a subject and a body. Sometimes a single line is fine, especially when the change is so simple that no further context is necessary.

For example:

    Fix typo in introduction to user guide

Nothing more need be said; if the reader wonders what the typo was, she can simply take a look at the change itself, i.e. use `git show`, `git diff`, or `git log -p`.

If you’re committing something like this at the command line, it’s easy to use the `-m` option to `git commit`:

    $ git commit -m "Fix typo in introduction to user guide"

However, when a commit merits a bit of explanation and context, you need to write a body. For example:

    Derezz the master control program

    MCP turned out to be evil and had become intent on world domination.
    This commit throws Tron's disc into MCP (causing its deresolution)
    and turns it back into a chess game.

Commit messages with bodies are not so easy to write with the `-m` option. You’re better off writing the message in a proper text editor. [See Pro Git](https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration).
In any case, the separation of subject from body pays off when browsing the log. Here’s the full log entry:

    $ git log
    commit 42e769bdf4894310333942ffc5a15151222a87be
    Author: Kevin Flynn <[email protected]>
    Date:   Fri Jan 01 00:00:00 1982 -0200

     Derezz the master control program

     MCP turned out to be evil and had become intent on world domination.
     This commit throws Tron's disc into MCP (causing its deresolution)
     and turns it back into a chess game.


#### 3. Limit the subject line to 50 characters
50 characters is not a hard limit, just a rule of thumb. Keeping subject lines at this length ensures that they are readable, and forces the author to think for a moment about the most concise way to explain what’s going on.

GitHub’s UI is fully aware of these conventions. It will warn you if you go past the 50 character limit, and will truncate any subject line longer than 72 characters with an ellipsis, so keeping the subject to 50 characters is best practice.

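To make the limits concrete, here is a hypothetical lint helper (not part of any tooling mentioned in this guide) that flags subjects against the 50/72 thresholds:

```python
def check_subject(subject, soft_limit=50, hard_limit=72):
    """Return a list of warnings for a commit subject line (illustrative helper)."""
    warnings = []
    if len(subject) > soft_limit:
        warnings.append(
            f"subject is {len(subject)} chars; aim for <= {soft_limit}"
        )
    if len(subject) > hard_limit:
        warnings.append(
            f"subject exceeds {hard_limit} chars and will be truncated in UIs"
        )
    return warnings
```

A subject like `"Fix typo in README file"` passes cleanly; a 60-character subject draws one warning, and an 80-character one draws both.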
#### 4. Use the imperative mood in the subject line
Imperative mood just means “spoken or written as if giving a command or instruction”. A few examples:

    Clean your room
    Close the door
    Take out the trash

Each of the six rules you’re reading about right now is written in the imperative (“Wrap the body at 72 characters”, etc.).

The imperative can sound a little rude; that’s why we don’t often use it. But it’s perfect for Git commit subject lines. One reason for this is that Git itself uses the imperative whenever it creates a commit on your behalf.

For example, the default message created when using git merge reads:

    Merge branch 'myfeature'

And when using git revert:

    Revert "Add the thing with the stuff"

    This reverts commit cc87791524aedd593cff5a74532befe7ab69ce9d.

Or when clicking the “Merge” button on a GitHub pull request:

    Merge pull request #123 from someuser/somebranch

So when you write your commit messages in the imperative, you’re following Git’s own built-in conventions. For example:

    Refactor subsystem X for readability
    Update getting started documentation
    Remove deprecated methods
    Release version 1.0.0

Writing this way can be a little awkward at first. We’re more used to speaking in the indicative mood, which is all about reporting facts. That’s why commit messages often end up reading like this:

    Fixed bug with Y
    Changing behavior of X

And sometimes commit messages get written as a description of their contents:

    More fixes for broken stuff
    Sweet new API methods

To remove any confusion, here’s a simple rule to get it right every time.

**A properly formed Git commit subject line should always be able to complete the following sentence:**

    If applied, this commit will <your subject line here>

For example:

    If applied, this commit will refactor subsystem X for readability
    If applied, this commit will update getting started documentation
    If applied, this commit will remove deprecated methods
    If applied, this commit will release version 1.0.0
    If applied, this commit will merge pull request #123 from user/branch

#### 5. Wrap the body at 72 characters
Git never wraps text automatically. When you write the body of a commit message, you must mind its right margin, and wrap text manually.

The recommendation is to do this at 72 characters, so that Git has plenty of room to indent text while still keeping everything under 80 characters overall.

A good text editor can help here. It’s easy to configure Vim, for example, to wrap text at 72 characters when you’re writing a Git commit.

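If your editor doesn't wrap for you, the standard library can: a minimal sketch using `textwrap` that wraps each paragraph of a body at 72 columns (the helper name is made up):

```python
import textwrap


def wrap_body(body, width=72):
    """Wrap each paragraph of a commit body at `width` columns (illustrative)."""
    paragraphs = body.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


body = (
    "Explain the problem that this commit is solving and focus on why "
    "the change is being made, as opposed to how the code does it."
)
wrapped = wrap_body(body)
# Every line of `wrapped` is now at most 72 characters wide.
```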
#### 6. Use the body to explain what and why vs. how
This [commit](https://github.com/bitcoin/bitcoin/commit/eb0b56b19017ab5c16c745e6da39c53126924ed6) from Bitcoin Core is a great example of explaining what changed and why:

```
commit eb0b56b19017ab5c16c745e6da39c53126924ed6
Author: Pieter Wuille <[email protected]>
Date:   Fri Aug 1 22:57:55 2014 +0200

   Simplify serialize.h's exception handling

   Remove the 'state' and 'exceptmask' from serialize.h's stream
   implementations, as well as related methods.

   As exceptmask always included 'failbit', and setstate was always
   called with bits = failbit, all it did was immediately raise an
   exception. Get rid of those variables, and replace the setstate
   with direct exception throwing (which also removes some dead
   code).

   As a result, good() is never reached after a failure (there are
   only 2 calls, one of which is in tests), and can just be replaced
   by !eof().

   fail(), clear(n) and exceptions() are just never called. Delete
   them.
```

Take a look at the [full diff](https://github.com/bitcoin/bitcoin/commit/eb0b56b19017ab5c16c745e6da39c53126924ed6) and just think how much time the author is saving fellow and future committers by taking the time to provide this context here and now. If he didn’t, it would probably be lost forever.

In most cases, you can leave out details about how a change has been made. Code is generally self-explanatory in this regard (and if the code is so complex that it needs to be explained in prose, that’s what source comments are for). Just focus on making clear the reasons why you made the change in the first place—the way things worked before the change (and what was wrong with that), the way they work now, and why you decided to solve it the way you did.

The future maintainer that thanks you may be yourself!


#### Tools worth mentioning

##### Using `--fixup`

If you've made a commit and then realize you've missed something or made a minor mistake, you can use the `--fixup` option.

For example, suppose you've made a commit with a hash `9fceb02`. Later, you realize you've left a debug statement in your code. Instead of making a new commit titled "remove debug statement" or "fix", you can do the following:

    $ git commit --fixup 9fceb02

This will create a new commit to fix the issue, with a message like "fixup! The original commit message".

##### Interactive Rebase

Interactive rebase, or `rebase -i`, can be used to squash these fixup commits into the original commits they're fixing, which cleans up your commit history. You can use the `--autosquash` option to automatically squash any commits marked as "fixup" into their target commits.

For example:

    $ git rebase -i --autosquash HEAD~5

This command starts an interactive rebase for the last 5 commits (`HEAD~5`). Any commits marked as "fixup" will be automatically moved to squash with their target commits.

The benefit of using `--fixup` and interactive rebase is that it keeps your commit history clean and readable. It groups fixes with the commits they are related to, rather than having a separate "fix" commit that might not make sense to other developers (or even to you) in the future.
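The whole workflow can be tried end to end in a throwaway repository. This sketch assumes `git` is installed; the file, identity, and commit names are illustrative:

```shell
# Demonstrate --fixup (and where --autosquash fits) in a throwaway repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "print('hello')" > app.py
git add app.py
git commit -qm "Add greeting script"

# Oops: a small mistake slipped in. Fix it up against the original commit.
echo "# debug removed" >> app.py
git add app.py
git commit -q --fixup "$(git rev-parse HEAD)"

git log --format=%s
# An interactive rebase with --autosquash would now fold the fixup commit
# back into "Add greeting script":
#   git rebase -i --autosquash HEAD~2
```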


---

#### Pull Request and Squashing Commits Caveats

While atomic commits are great for development and for understanding the changes within the branch, the commit history can get messy when merging to the main branch. To keep a cleaner and more understandable commit history in our main branch, we encourage squashing all the commits of a PR into one when merging.

This single commit should provide an overview of the changes that the PR introduced. It should follow the guidelines for atomic commits (an atomic commit is complete, self-contained, and understandable) but on the scale of the entire feature, task, or fix that the PR addresses. This approach combines the benefits of atomic commits during development with a clean commit history in our main branch.

Here is how you can squash commits:

```bash
git rebase -i HEAD~n
```

where `n` is the number of commits to squash. After running the command, replace `pick` with `squash` for the commits you want to squash into the previous commit. This will combine the commits and allow you to write a new commit message.

In this context, an atomic commit message could look like:

```
Add feature X

This commit introduces feature X which does A, B, and C. It adds
new files for layout, updates the code behind the file, and introduces
new resources. This change is important because it allows users to
perform task Y more efficiently.

It includes:
- Creation of new layout file
- Updates in the code-behind file
- Addition of new resources

Resolves: #123
```

In your PRs, remember to detail what the PR is introducing or fixing. This will be helpful for reviewers to understand the context and the reason behind the changes.


---
File: /docs/stream_tutorial/client.py
---

import argparse
import asyncio
import bittensor as bt

from protocol import StreamPrompting

"""
This assumes you have:
1. Registered your miner on the chain (finney/test)
2. Started serving your miner on an open port (e.g. 12345)

Steps:
- Instantiate your synapse subclass with the relevant information, e.g. messages, roles, etc.
- Instantiate your wallet and a dendrite client
- Query the dendrite client with your synapse object
- Iterate over the async generator to extract the tokens yielded on the server side
"""


async def query_synapse(my_uid, wallet_name, hotkey, network, netuid):
    syn = StreamPrompting(
        roles=["user"],
        messages=[
            "hello this is a test of a streaming response. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
        ],
    )

    # Create a wallet instance with the provided wallet name and hotkey.
    wallet = bt.wallet(name=wallet_name, hotkey=hotkey)

    # Instantiate the metagraph with the provided network and netuid.
    metagraph = bt.metagraph(
        netuid=netuid, network=network, sync=True, lite=False
    )

    # Grab the axon you're serving.
    axon = metagraph.axons[my_uid]

    # Create a Dendrite instance to handle client-side communication.
    dendrite = bt.dendrite(wallet=wallet)

    async def main():
        responses = await dendrite(
            [axon], syn, deserialize=False, streaming=True
        )

        for resp in responses:
            i = 0
            async for chunk in resp:
                i += 1
                if i % 5 == 0:
                    print()
                if isinstance(chunk, list):
                    print(chunk[0], end="", flush=True)
                else:
                    # The last object yielded is the synapse itself, with the completion filled.
                    synapse = chunk
            break

    # Run the main function with asyncio.
    await main()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Query a Bittensor synapse with given parameters."
    )

    # Add arguments.
    parser.add_argument(
        "--my_uid",
        type=int,
        required=True,
        help="Your unique miner ID on the chain",
    )
    parser.add_argument(
        "--netuid", type=int, required=True, help="Network Unique ID"
    )
    parser.add_argument(
        "--wallet_name", type=str, default="default", help="Name of the wallet"
    )
    parser.add_argument(
        "--hotkey", type=str, default="default", help="Hotkey for the wallet"
    )
    parser.add_argument(
        "--network",
        type=str,
        default="test",
        help='Network type, e.g., "test" or "mainnet"',
    )

    # Parse arguments.
    args = parser.parse_args()

    # Run the async function with the provided arguments.
    asyncio.run(
        query_synapse(
            args.my_uid,
            args.wallet_name,
            args.hotkey,
            args.network,
            args.netuid,
        )
    )


---
File: /docs/stream_tutorial/config.py
---

import bittensor as bt
import argparse
import os


def check_config(cls, config: "bt.Config"):
    bt.axon.check_config(config)
    bt.logging.check_config(config)
    full_path = os.path.expanduser(
        "{}/{}/{}/{}".format(
            config.logging.logging_dir,
            config.wallet.get("name", bt.defaults.wallet.name),
            config.wallet.get("hotkey", bt.defaults.wallet.hotkey),
            config.miner.name,
        )
    )
    config.miner.full_path = full_path
    if not os.path.exists(config.miner.full_path):
        os.makedirs(config.miner.full_path)


def get_config() -> "bt.Config":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--axon.port", type=int, default=8098, help="Port to run the axon on."
    )
    # Subtensor network to connect to
    parser.add_argument(
        "--subtensor.network",
        default="finney",
        help="Bittensor network to connect to.",
    )
    # Chain endpoint to connect to
    parser.add_argument(
        "--subtensor.chain_endpoint",
        default="wss://entrypoint-finney.opentensor.ai:443",
        help="Chain endpoint to connect to.",
    )
    # Adds override arguments for network and netuid.
    parser.add_argument(
        "--netuid", type=int, default=1, help="The chain subnet uid."
    )

    parser.add_argument(
        "--miner.root",
        type=str,
        help="Trials for this miner go in miner.root / (wallet_cold - wallet_hot) / miner.name",
        default="~/.bittensor/miners/",
    )
    parser.add_argument(
        "--miner.name",
        type=str,
        help="Trials for this miner go in miner.root / (wallet_cold - wallet_hot) / miner.name",
        default="Bittensor Miner",
    )

    # Run config.
    parser.add_argument(
        "--miner.blocks_per_epoch",
        type=int,
        help="Blocks until the miner re-pulls the metagraph from the chain",
        default=100,
    )

    # Switches.
    parser.add_argument(
        "--miner.no_serve",
        action="store_true",
        help="If True, the miner doesn't serve the axon.",
        default=False,
    )
    parser.add_argument(
        "--miner.no_start_axon",
        action="store_true",
        help="If True, the miner doesn't start the axon.",
        default=False,
    )

    # Mocks.
    parser.add_argument(
        "--miner.mock_subtensor",
        action="store_true",
        help="If True, the miner will allow non-registered hotkeys to mine.",
        default=False,
    )

    # Adds subtensor specific arguments, i.e. --subtensor.chain_endpoint ... --subtensor.network ...
    bt.subtensor.add_args(parser)

    # Adds logging specific arguments, i.e. --logging.debug ..., --logging.trace ... or --logging.logging_dir ...
    bt.logging.add_args(parser)

    # Adds wallet specific arguments, i.e. --wallet.name ..., --wallet.hotkey ... or --wallet.path ...
    bt.wallet.add_args(parser)

    # Adds axon specific arguments, i.e. --axon.port ...
    bt.axon.add_args(parser)

    # Activate the parser to read any command-line inputs.
    # To print a help message, run: python3 template/miner.py --help
    config = bt.config(parser)

    # Logging captures events for diagnosing or understanding the miner's behavior.
    config.full_path = os.path.expanduser(
        "{}/{}/{}/netuid{}/{}".format(
            config.logging.logging_dir,
            config.wallet.name,
            config.wallet.hotkey,
            config.netuid,
            "miner",
        )
    )
    # Ensure the directory for logging exists, else create one.
    if not os.path.exists(config.full_path):
        os.makedirs(config.full_path, exist_ok=True)
    return config


---
File: /docs/stream_tutorial/miner.py
---

import copy
import time
import asyncio
import argparse
import threading
import traceback
from abc import ABC, abstractmethod
from functools import partial
from starlette.types import Send

import bittensor as bt
from transformers import GPT2Tokenizer
from typing import List, Dict, Tuple, Union, Callable, Awaitable

from protocol import StreamPrompting
from config import get_config, check_config


class StreamMiner(ABC):
    def __init__(self, config=None, axon=None, wallet=None, subtensor=None):
        # Setup base config from Miner.config() and merge with subclassed config.
        base_config = copy.deepcopy(config or get_config())
        self.config = self.config()
        self.config.merge(base_config)

        check_config(StreamMiner, self.config)
        bt.logging.info(self.config)  # TODO: duplicate print?

        self.prompt_cache: Dict[str, Tuple[str, int]] = {}

        # Activate Bittensor's logging with the set configurations.
        bt.logging(config=self.config, logging_dir=self.config.full_path)
        bt.logging.info("Setting up bittensor objects.")

        # The wallet holds cryptographic information, ensuring secure transactions and communication.
        self.wallet = wallet or bt.wallet(config=self.config)
        bt.logging.info(f"Wallet {self.wallet}")

        # The subtensor manages the blockchain connection, facilitating interaction with the Bittensor blockchain.
        self.subtensor = subtensor or bt.subtensor(config=self.config)
        bt.logging.info(f"Subtensor: {self.subtensor}")
        bt.logging.info(
            f"Running miner for subnet: {self.config.netuid} on network: {self.subtensor.chain_endpoint} with config:"
        )

        # The metagraph provides the network's current state, holding state about other participants in a subnet.
        self.metagraph = self.subtensor.metagraph(self.config.netuid)
        bt.logging.info(f"Metagraph: {self.metagraph}")

        if self.wallet.hotkey.ss58_address not in self.metagraph.hotkeys:
            bt.logging.error(
                f"\nYour miner: {self.wallet} is not registered to chain connection: {self.subtensor} \nRun `btcli register` and try again."
            )
            exit()
        else:
            # Each miner gets a unique identity (UID) in the network for differentiation.
            self.my_subnet_uid = self.metagraph.hotkeys.index(
                self.wallet.hotkey.ss58_address
            )
            bt.logging.info(f"Running miner on uid: {self.my_subnet_uid}")

        # The axon handles request processing, allowing validators to send this process requests.
        self.axon = axon or bt.axon(
            wallet=self.wallet, port=self.config.axon.port
        )
        # Attach the functions that are called when servicing a request.
        bt.logging.info("Attaching forward function to axon.")
        print(f"Attaching forward function to axon. {self._prompt}")
        self.axon.attach(
            forward_fn=self._prompt,
        )
        bt.logging.info(f"Axon created: {self.axon}")

        # Instantiate runners
        self.should_exit: bool = False
        self.is_running: bool = False
        self.thread: threading.Thread = None
        self.lock = asyncio.Lock()
        self.request_timestamps: Dict = {}
    @abstractmethod
    def config(self) -> "bt.Config":
        ...

    @classmethod
    @abstractmethod
    def add_args(cls, parser: argparse.ArgumentParser):
        ...

    def _prompt(self, synapse: StreamPrompting) -> StreamPrompting:
        """
        A wrapper method around the `prompt` method that will be defined by the subclass.

        This method acts as an intermediary layer between the axon and the subclass
        implementation. As written, it simply forwards the request to the subclass
        `prompt` method; it is the natural place to add pre-processing, such as
        checking a prompt cache to avoid reprocessing recent requests.

        Args:
            synapse (StreamPrompting): The incoming request object encapsulating the details of the request.

        Returns:
            StreamPrompting: The response object to be sent back in reply to the incoming request, essentially
            the filled synapse request object.

        Example:
            This method is not meant to be called directly but is invoked internally when a request
            is received, and it subsequently calls the `prompt` method of the subclass.
        """
        return self.prompt(synapse)

    @abstractmethod
    def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
        """
        Abstract method to handle and respond to incoming requests to the miner.

        Subclasses should implement this method to define their custom logic for processing and
        responding to requests. This method is designed to be overridden, and its behavior will
        be dependent on the specific implementation provided in the subclass.

        Args:
            synapse (StreamPrompting): The incoming request object encapsulating the details
                of the request. This must contain `messages` and `roles` as fields.

        Returns:
            StreamPrompting: The response object that should be sent back in reply to the
                incoming request. This is essentially the filled synapse request object.

        Example:
            class CustomMiner(Miner):
                def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
                    # Custom logic to process and respond to the request.
                    synapse.completion = "The meaning of life is 42."
                    return synapse
        """
        ...
1204    def run(self):
1205        """
1206        Runs the miner logic. This method starts the miner's operations, including
1207        listening for incoming requests and periodically updating the miner's knowledge
1208        of the network graph.
1209        """
1210        if not self.subtensor.is_hotkey_registered(
1211            netuid=self.config.netuid,
1212            hotkey_ss58=self.wallet.hotkey.ss58_address,
1213        ):
1214            bt.logging.error(
1215                f"Wallet: {self.wallet} is not registered on netuid {self.config.netuid}. "
1216                f"Please register the hotkey using `btcli subnets register` before trying again."
1217            )
1218            exit()
1219
1220        # Serve passes the axon information to the network + netuid we are hosting on.
1221        # This will auto-update if the axon port or external IP has changed.
1222        bt.logging.info(
1223            f"Serving axon {StreamPrompting} on network: {self.config.subtensor.chain_endpoint} with netuid: {self.config.netuid}"
1224        )
1225        self.axon.serve(netuid=self.config.netuid, subtensor=self.subtensor)
1226
1227        # Start the miner's axon, making it active on the network.
1228        bt.logging.info(
1229            f"Starting axon server on port: {self.config.axon.port}"
1230        )
1231        self.axon.start()
1232
1233        # --- Run until should_exit = True.
1234        self.last_epoch_block = self.subtensor.get_current_block()
1235        bt.logging.info(f"Miner starting at block: {self.last_epoch_block}")
1236
1237        # This loop maintains the miner's operations until intentionally stopped.
1238        bt.logging.info(f"Starting main loop")
1239        step = 0
1240        try:
1241            while not self.should_exit:
1242                start_epoch = time.time()
1243
1244                # --- Wait until next epoch.
1245                current_block = self.subtensor.get_current_block()
1246                while (
1247                    current_block - self.last_epoch_block
1248                    < self.config.miner.blocks_per_epoch
1249                ):
1250                    # --- Wait for next block.
1251                    time.sleep(1)
1252                    current_block = self.subtensor.get_current_block()
1253
1254                    # --- Check if we should exit.
1255                    if self.should_exit:
1256                        break
1257
1258                # --- Update the metagraph with the latest network state.
1259                self.last_epoch_block = self.subtensor.get_current_block()
1260
1261                metagraph = self.subtensor.metagraph(
1262                    netuid=self.config.netuid,
1263                    lite=True,
1264                    block=self.last_epoch_block,
1265                )
1266                log = (
1267                    f"Step:{step} | "
1268                    f"Block:{metagraph.block.item()} | "
1269                    f"Stake:{metagraph.S[self.my_subnet_uid]} | "
1270                    f"Rank:{metagraph.R[self.my_subnet_uid]} | "
1271                    f"Trust:{metagraph.T[self.my_subnet_uid]} | "
1272                    f"Consensus:{metagraph.C[self.my_subnet_uid]} | "
1273                    f"Incentive:{metagraph.I[self.my_subnet_uid]} | "
1274                    f"Emission:{metagraph.E[self.my_subnet_uid]}"
1275                )
1276                bt.logging.info(log)
1277
1278                step += 1
1279
1280        # If someone intentionally stops the miner, it'll safely terminate operations.
1281        except KeyboardInterrupt:
1282            self.axon.stop()
1283            bt.logging.success("Miner killed by keyboard interrupt.")
1284            exit()
1285
1286        # In case of unforeseen errors, the miner will log the error and continue operations.
1287        except Exception as e:
1288            bt.logging.error(traceback.format_exc())
1289
1290    def run_in_background_thread(self):
1291        """
1292        Starts the miner's operations in a separate background thread.
1293        This is useful for non-blocking operations.
1294        """
1295        if not self.is_running:
1296            bt.logging.debug("Starting miner in background thread.")
1297            self.should_exit = False
1298            self.thread = threading.Thread(target=self.run, daemon=True)
1299            self.thread.start()
1300            self.is_running = True
1301            bt.logging.debug("Started")
1302
1303    def stop_run_thread(self):
1304        """
1305        Stops the miner's operations that are running in the background thread.
1306        """
1307        if self.is_running:
1308            bt.logging.debug("Stopping miner in background thread.")
1309            self.should_exit = True
1310            self.thread.join(5)
1311            self.is_running = False
1312            bt.logging.debug("Stopped")
1313
1314    def __enter__(self):
1315        """
1316        Starts the miner's operations in a background thread upon entering the context.
1317        This method facilitates the use of the miner in a 'with' statement.
1318        """
1319        self.run_in_background_thread()
1320
1321    def __exit__(self, exc_type, exc_value, traceback):
1322        """
1323        Stops the miner's background operations upon exiting the context.
1324        This method facilitates the use of the miner in a 'with' statement.
1325
1326        Args:
1327            exc_type: The type of the exception that caused the context to be exited.
1328                      None if the context was exited without an exception.
1329            exc_value: The instance of the exception that caused the context to be exited.
1330                       None if the context was exited without an exception.
1331            traceback: A traceback object encoding the stack trace.
1332                       None if the context was exited without an exception.
1333        """
1334        self.stop_run_thread()
1335
1336
1337class StreamingTemplateMiner(StreamMiner):
1338    def config(self) -> "bt.Config":
1339        """
1340        Returns the configuration object specific to this miner.
1341
1342        Implement and extend this method to provide custom configurations for the miner.
1343        Currently, it sets up a basic configuration parser.
1344
1345        Returns:
1346            bt.Config: A configuration object with the miner's operational parameters.
1347        """
1348        parser = argparse.ArgumentParser(description="Streaming Miner Configs")
1349        self.add_args(parser)
1350        return bt.config(parser)
1351
1352    def add_args(self, parser: argparse.ArgumentParser):
1353        """
1354        Adds custom arguments to the command line parser.
1355
1356        Developers can introduce additional command-line arguments specific to the miner's
1357        functionality in this method. These arguments can then be used to configure the miner's operation.
1358
1359        Args:
1360            parser (argparse.ArgumentParser):
1361                The command line argument parser to which custom arguments should be added.
1362        """
1363        pass
1364
1365    def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1366        """
1367        Generates a streaming response for the provided synapse.
1368
1369        This function serves as the main entry point for handling streaming prompts. It takes
1370        the incoming synapse which contains messages to be processed and returns a streaming
1371        response. The function uses the GPT-2 tokenizer and a simulated model to tokenize and decode
1372        the incoming message, and then sends the response back to the client token by token.
1373
1374        Args:
1375            synapse (StreamPrompting): The incoming StreamPrompting instance containing the messages to be processed.
1376
1377        Returns:
1378            StreamPrompting: The streaming response object which can be used by other functions to
1379                            stream back the response to the client.
1380
1381        Usage:
1382            This function can be extended and customized based on specific requirements of the
1383            miner. Developers can swap out the tokenizer, model, or adjust how streaming responses
1384            are generated to suit their specific applications.
1385        """
1386        bt.logging.trace("HI. PROMPT()")
1387        tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
1388
1389        # Simulated function to decode token IDs into strings. In a real-world scenario,
1390        # this can be replaced with an actual model inference step.
1391        def model(ids):
1392            return (tokenizer.decode(id) for id in ids)
1393
1394        async def _prompt(text: str, send: Send):
1395            """
1396            Asynchronously processes the input text and sends back tokens as a streaming response.
1397
1398            This function takes an input text, tokenizes it using the GPT-2 tokenizer, and then
1399            uses the simulated model to decode token IDs into strings. It then sends each token
1400            back to the client as a streaming response, with a delay between tokens to simulate
1401            the effect of real-time streaming.
1402
1403            Args:
1404                text (str): The input text message to be processed.
1405                send (Send): An asynchronous function that allows sending back the streaming response.
1406
1407            Usage:
1408                This function can be adjusted based on the streaming requirements, speed of
1409                response, or the model being used. Developers can also introduce more sophisticated
1410                processing steps or modify how tokens are sent back to the client.
1411            """
1412            bt.logging.trace("HI. _PROMPT()")
1413            input_ids = tokenizer(
1414                text, return_tensors="pt"
1415            ).input_ids.squeeze()
1416            buffer = []
1417            bt.logging.debug(f"Input text: {text}")
1418            bt.logging.debug(f"Input ids: {input_ids}")
1419
1420            N = 3  # Number of tokens to send back to the client at a time
1421            for token in model(input_ids):
1422                bt.logging.trace(f"appending token: {token}")
1423                buffer.append(token)
1424                # If buffer has N tokens, send them back to the client.
1425                if len(buffer) == N:
1426                    time.sleep(0.1)
1427                    joined_buffer = "".join(buffer)
1428                    bt.logging.debug(f"sending tokens: {joined_buffer}")
1429                    await send(
1430                        {
1431                            "type": "http.response.body",
1432                            "body": joined_buffer.encode("utf-8"),
1433                            "more_body": True,
1434                        }
1435                    )
1436                    bt.logging.debug(f"Streamed tokens: {joined_buffer}")
1437                    buffer = []  # Clear the buffer for next batch of tokens
1438
1439            # Send any remaining tokens in the buffer
1440            if buffer:
1441                joined_buffer = "".join(buffer)
1442                await send(
1443                    {
1444                        "type": "http.response.body",
1445                        "body": joined_buffer.encode("utf-8"),
1446                        "more_body": False,  # No more tokens to send
1447                    }
1448                )
1449                bt.logging.trace(f"Streamed tokens: {joined_buffer}")
1450
1451        message = synapse.messages[0]
1452        bt.logging.trace(f"message in _prompt: {message}")
1453        token_streamer = partial(_prompt, message)
1454        bt.logging.trace(f"token streamer: {token_streamer}")
1455        return synapse.create_streaming_response(token_streamer)
1456
1457
1458# Main entry point: run the miner until interrupted.
1459if __name__ == "__main__":
1460    with StreamingTemplateMiner():
1461        while True:
1462            time.sleep(1)
1463
1464
1465
1466---
1467File: /docs/stream_tutorial/protocol.py
1468---
1469
1470import pydantic
1471import bittensor as bt
1472
1473from abc import ABC, abstractmethod
1474from typing import List, Union, Callable, Awaitable
1475from starlette.responses import StreamingResponse
1476
1477
1478class StreamPrompting(bt.StreamingSynapse):
1479    """
1480    StreamPrompting is a specialized implementation of the `StreamingSynapse` tailored for prompting functionalities within
1481    the Bittensor network. This class is intended to interact with a streaming response that contains a sequence of tokens,
1482    which represent prompts or messages in a certain scenario.
1483
1484    As a developer, when using or extending the `StreamPrompting` class, you should be primarily focused on the structure
1485    and behavior of the prompts you are working with. The class has been designed to seamlessly handle the streaming,
1486    decoding, and accumulation of tokens that represent these prompts.
1487
1488    Attributes:
1489    - `roles` (List[str]): A list of roles involved in the prompting scenario. This could represent different entities
1490                           or agents involved in the conversation or use-case. They are immutable to ensure consistent
1491                           interaction throughout the lifetime of the object.
1492
1493    - `messages` (List[str]): These represent the actual prompts or messages in the prompting scenario. They are also
1494                              immutable to ensure consistent behavior during processing.
1495
1496    - `completion` (str): Stores the processed result of the streaming tokens. As tokens are streamed, decoded, and
1497                          processed, they are accumulated in the completion attribute. This represents the "final"
1498                          product or result of the streaming process.
1499    - `required_hash_fields` (List[str]): A list of fields that are required for the hash.
1500
1501    Methods:
1502    - `process_streaming_response`: This method asynchronously processes the incoming streaming response by decoding
1503                                    the tokens and accumulating them in the `completion` attribute.
1504
1505    - `deserialize`: Converts the `completion` attribute into its desired data format, in this case, a string.
1506
1507    - `extract_response_json`: Extracts relevant JSON data from the response, useful for gaining insights on the response's
1508                               metadata or for debugging purposes.
1509
1510    Note: While you can directly use the `StreamPrompting` class, it's designed to be extensible. Thus, you can create
1511    subclasses to further customize behavior for specific prompting scenarios or requirements.
1512    """
1513
1514    roles: List[str] = pydantic.Field(
1515        ...,
1516        title="Roles",
1517        description="A list of roles in the StreamPrompting scenario. Immutable.",
1518        allow_mutation=False,
1519    )
1520
1521    messages: List[str] = pydantic.Field(
1522        ...,
1523        title="Messages",
1524        description="A list of messages in the StreamPrompting scenario. Immutable.",
1525        allow_mutation=False,
1526    )
1527
1528    required_hash_fields: List[str] = pydantic.Field(
1529        ["messages"],
1530        title="Required Hash Fields",
1531        description="A list of required fields for the hash.",
1532        allow_mutation=False,
1533    )
1534
1535    completion: str = pydantic.Field(
1536        "",
1537        title="Completion",
1538        description="Completion status of the current StreamPrompting object. This attribute is mutable and can be updated.",
1539    )
1540
1541    async def process_streaming_response(self, response: StreamingResponse):
1542        """
1543        `process_streaming_response` is an asynchronous method designed to process the incoming streaming response from the
1544        Bittensor network. It's the heart of the StreamPrompting class, ensuring that streaming tokens, which represent
1545        prompts or messages, are decoded and appropriately managed.
1546
1547        As the streaming response is consumed, the tokens are decoded from their 'utf-8' encoded format, split based on
1548        newline characters, and concatenated into the `completion` attribute. This accumulation of decoded tokens in the
1549        `completion` attribute allows for a continuous and coherent accumulation of the streaming content.
1550
1551        Args:
1552            response: The streaming response object containing the content chunks to be processed. Each chunk in this
1553                      response is expected to be a set of tokens that can be decoded and split into individual messages or prompts.
1554        """
1555        if self.completion is None:
1556            self.completion = ""
1557        bt.logging.debug(
1558            "Processing streaming response (StreamingSynapse base class)."
1559        )
1560        async for chunk in response.content.iter_any():
1561            bt.logging.debug(f"Processing chunk: {chunk}")
1562            tokens = chunk.decode("utf-8").split("\n")
1563            for token in tokens:
1564                bt.logging.debug(f"--processing token: {token}")
1565                if token:
1566                    self.completion += token
1567            bt.logging.debug(f"yielding tokens {tokens}")
1568            yield tokens
1569
1570    def deserialize(self) -> str:
1571        """
1572        Deserializes the response by returning the completion attribute.
1573
1574        Returns:
1575            str: The completion result.
1576        """
1577        return self.completion
1578
1579    def extract_response_json(self, response: StreamingResponse) -> dict:
1580        """
1581        `extract_response_json` is a method that performs the crucial task of extracting pertinent JSON data from the given
1582        response. The method is especially useful when you need a detailed insight into the streaming response's metadata
1583        or when debugging response-related issues.
1584
1585        Beyond just extracting the JSON data, the method also processes and structures the data for easier consumption
1586        and understanding. For instance, it extracts specific headers related to dendrite and axon, offering insights
1587        about the Bittensor network's internal processes. The method ultimately returns a dictionary with a structured
1588        view of the extracted data.
1589
1590        Args:
1591            response: The response object from which to extract the JSON data. This object typically includes headers and
1592                      content which can be used to glean insights about the response.
1593
1594        Returns:
1595            dict: A structured dictionary containing:
1596                - Basic response metadata such as name, timeout, total_size, and header_size.
1597                - Dendrite and Axon related information extracted from headers.
1598                - Roles and Messages pertaining to the current StreamPrompting instance.
1599                - The accumulated completion.
1600        """
1601        headers = {
1602            k.decode("utf-8"): v.decode("utf-8")
1603            for k, v in response.__dict__["_raw_headers"]
1604        }
1605
1606        def extract_info(prefix):
1607            return {
1608                key.split("_")[-1]: value
1609                for key, value in headers.items()
1610                if key.startswith(prefix)
1611            }
1612
1613        return {
1614            "name": headers.get("name", ""),
1615            "timeout": float(headers.get("timeout", 0)),
1616            "total_size": int(headers.get("total_size", 0)),
1617            "header_size": int(headers.get("header_size", 0)),
1618            "dendrite": extract_info("bt_header_dendrite"),
1619            "axon": extract_info("bt_header_axon"),
1620            "roles": self.roles,
1621            "messages": self.messages,
1622            "completion": self.completion,
1623        }
1624
1625
1626
1627---
1628File: /docs/stream_tutorial/README.md
1629---
1630
1631# Bittensor Streaming Tutorial
1632This document is intended as a developer-friendly walkthrough of integrating streaming into your bittensor application.
1633
1634If you prefer to jump right into a complete stand-alone example, see:
1635- `miner.py`
1636- `protocol.py`
1637- `client.py`
1638
1639Start your miner:
1640```bash
1641python miner.py --netuid 8 --wallet.name default --wallet.hotkey miner --subtensor.network test --axon.port 10000 --logging.trace
1642```
1643
1644Run the client:
1645```bash
1646python client.py --netuid 8 --my_uid 1 --network test
1647```
1648
1649## Overview
1650This tutorial is designed to show you how to use the streaming API to integrate into your application. It will cover the following topics:
1651- writing your streaming protocol (inherits from bittensor.StreamingSynapse)
1652- writing your streaming server (uses your streaming protocol)
1653- writing your streaming client (uses your streaming protocol)
1654
1655### Defining your streaming protocol
1656When designing your protocol, it is helpful to look at the `bittensor.StreamingSynapse` class for reference. Below is a condensed snippet of the abstract methods that you will need to implement in your subclass.
1657
1658You will need to implement two methods:
1659
1660- `process_streaming_response`
1661- `extract_response_json`
1662
1663These two methods are the core of your streaming protocol. The first, `process_streaming_response`, is called as the response is streamed from the network and is responsible for handling it, such as parsing and accumulating data. The second, `extract_response_json`, is called after the response has been processed and is responsible for retrieving structured data to be post-processed by the dendrite in bittensor core code.
1664
1665```python
1666class StreamingSynapse(bittensor.Synapse, ABC):
1667    ...
1668    class BTStreamingResponse(_StreamingResponse):
1669        ...
1670    @abstractmethod
1671    async def process_streaming_response(self, response: Response):
1672        """
1673        Abstract method that must be implemented by the subclass.
1674        This method should provide logic to handle the streaming response, such as parsing and accumulating data.
1675        It is called as the response is being streamed from the network, and should be implemented to handle the specific
1676        streaming data format and requirements of the subclass.
1677
1678        Args:
1679            response: The response object to be processed, typically containing chunks of data.
1680        """
1681        ...
1682
1683    @abstractmethod
1684    def extract_response_json(self, response: Response) -> dict:
1685        """
1686        Abstract method that must be implemented by the subclass.
1687        This method should provide logic to extract JSON data from the response, including headers and content.
1688        It is called after the response has been processed and is responsible for retrieving structured data
1689        that can be used by the application.
1690
1691        Args:
1692            response: The response object from which to extract JSON data.
1693        """
1694        ...
1695    ...
1696```
1697
1698See the full reference code at the bittensor [repo](https://github.com/opentensor/bittensor/blob/master/bittensor/stream.py).
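
To make the role of `process_streaming_response` concrete, here is a minimal, self-contained sketch of the decode-split-accumulate pattern it typically implements, using plain asyncio in place of the real response object (`fake_stream` and `accumulate` are illustrative names, not part of the bittensor API):

```python
import asyncio


async def fake_stream():
    # Stands in for `response.content.iter_any()`: yields raw byte chunks.
    for chunk in [b"Hello", b" wor", b"ld\n"]:
        yield chunk


async def accumulate(stream) -> str:
    """Decode each chunk, drop newline separators, and build the completion."""
    completion = ""
    async for chunk in stream:
        for token in chunk.decode("utf-8").split("\n"):
            if token:
                completion += token
    return completion


if __name__ == "__main__":
    print(asyncio.run(accumulate(fake_stream())))  # Hello world
```

A real implementation would also `yield` the tokens to the caller as they arrive, which is what turns the method into the async generator the client consumes.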
1699
1700
1701#### Create your protocol
1702Let's walk through how to create a protocol using the bittensor.StreamingSynapse class.
1703```python
1704class MyStreamingSynapse(bt.StreamingSynapse):
1705    # define your expected data fields here as pydantic field objects
1706    # This allows you to control what information is passed along the network
1707    messages: List[str] = pydantic.Field(
1708        ..., # this ellipsis (...) indicates the object is required
1709        title="Messages", # What is the name of this field?
1710        description="A list of messages in the Prompting scenario. Immutable.",
1711        allow_mutation=False, # disallow modification of this field after creation
1712    )
1713    completion: str = pydantic.Field(
1714        "",
1715        title="Completion",
1716    )
1717    # add fields as necessary
1718    ...
1719
1720    # This method controls how your synapse is deserialized from the network
1721    # E.g. you can extract whatever information you want to receive at the final
1722    # yield in the async generator returned by the server, without receiving
1723    # the entire synapse object itself.
1724    # In this example, we just want the completion string at the end.
1725    def deserialize(self) -> str:
1726        return self.completion
1727
1728    # implement your `process_streaming_response` logic to actually yield objects to the streamer
1729    # this effectively defines the async generator that you'll receive on the client side
1730    async def process_streaming_response(self, response: MyStreamingSynapse):
1731        # this is an example of how you might process a streaming response
1732        # iterate over the response content and yield each line
1733        async for chunk in response.content.iter_any():
1734            tokens = chunk.decode("utf-8").split("\n")
1735            yield tokens
1736    
1737    # implement `extract_response_json` to extract the JSON data from the response headers
1738    # this will be dependent on the data you are streaming and how you want to structure it
1739    # it MUST conform to the following format expected by the bittensor dendrite:
1740    """
1741        {
1742            # METADATA AND HEADERS
1743            "name": ...,
1744            "timeout": float(...),
1745            "total_size": int(...),
1746            "header_size": int(...),
1747            "dendrite": ...,
1748            "axon": ...,
1749            # YOUR FIELDS
1750            "messages": self.messages,
1751            ...
1752        }
1753    """
1754    def extract_response_json(self, response: MyStreamingSynapse) -> dict:
1755        # iterate over the response headers and extract the necessary data
1756        headers = {
1757            k.decode("utf-8"): v.decode("utf-8")
1758            for k, v in response.__dict__["_raw_headers"]
1759        }
1760        # helper function to extract data from headers
1761        def extract_info(prefix):
1762            return {
1763                key.split("_")[-1]: value
1764                for key, value in headers.items()
1765                if key.startswith(prefix)
1766            }
1767        # return the extracted data in the expected format
1768        return {
1769            "name": headers.get("name", ""),
1770            "timeout": float(headers.get("timeout", 0)),
1771            "total_size": int(headers.get("total_size", 0)),
1772            "header_size": int(headers.get("header_size", 0)),
1773            "dendrite": extract_info("bt_header_dendrite"), # dendrite info
1774            "axon": extract_info("bt_header_axon"), # axon info
1775            "messages": self.messages, # field object
1776        }
1777```
1778
1779[Here](https://github.com/opentensor/text-prompting/blob/main/prompting/protocol.py#L131) is a full example implementation of a streaming protocol based on the text-prompting network.
1780
1781Please read the docstrings provided, they can be very helpful!
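
The `extract_info` helper in the example above is also easy to exercise on its own. Below is a standalone sketch with made-up header values (the `bt_header_*` keys mirror the prefixes the example filters on; the sample values are invented for illustration):

```python
def extract_info(headers: dict, prefix: str) -> dict:
    # Keep only headers with the given prefix, keyed by their last
    # underscore-separated component (e.g. "bt_header_axon_port" -> "port").
    return {
        key.split("_")[-1]: value
        for key, value in headers.items()
        if key.startswith(prefix)
    }


if __name__ == "__main__":
    headers = {
        "bt_header_dendrite_ip": "1.2.3.4",  # invented sample value
        "bt_header_axon_port": "10000",      # invented sample value
        "name": "StreamPrompting",
    }
    print(extract_info(headers, "bt_header_dendrite"))  # {'ip': '1.2.3.4'}
    print(extract_info(headers, "bt_header_axon"))      # {'port': '10000'}
```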
1782
1783### Writing the server
1784Great! Now that we have our protocol defined, let's see how to define our server.
1785This will generate the tokens to be streamed in this prompting example.
1786
1787For brevity we will not be building a full miner, but inspecting the central components.
1788```python
1789class MyStreamPromptingMiner(bt.Miner):
1790    ... # any relevant methods you'd need for your miner
1791
1792    # define your server forward here
1793    # NOTE: It is crucial that your typehints are correct and reflect your streaming protocol object
1794    # otherwise the axon will reject adding your route to the server.
1795    def forward(self, synapse: MyStreamingSynapse) -> MyStreamingSynapse:
1796        # Let's use a GPT2 tokenizer for this toy example
1797        tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
1798
1799        # Simulated function to decode token IDs into strings. In a real-world scenario,
1800        # this can be replaced with an actual model inference step.
1801        def model(ids):
1802            return (tokenizer.decode(id) for id in ids)
1803
1804        # This function is called asynchronously to process the input text and send back tokens
1805        # as a streaming response. It essentially produces the async generator that will be
1806        # consumed by the client with an `async for` loop.
1807        async def _forward(text: str, send: Send):
1808            # `text` may be the input prompt to your model in a real-world scenario.
1809            # let's tokenize them into IDs for the sake of this example.
1810            input_ids = tokenizer(text, return_tensors="pt").input_ids.squeeze()
1811            
1812            # You may want to buffer your tokens before sending them back to the client.
1813            # This can be useful so we aren't flooding the client with individual tokens
1814            # and allows you more fine-grained control over how much data is sent back
1815            # with each yield.
1816            N = 3  # Number of tokens to send back to the client at a time
1817            buffer = []
1818            # Iterate over the tokens and send the generated tokens back to the client
1819            # when we have sufficient (N) tokens in the buffer.
1820            for token in model(input_ids):
1821                buffer.append(token) # Add token to buffer
1822
1823                # If buffer has N tokens, send them back to the client.
1824                if len(buffer) == N:
1825                    joined_buffer = "".join(buffer)
1826                    # Send the tokens back to the client
1827                    # This is the core of the streaming response and the format 
1828                    # is important. The `send` function is provided by the ASGI server
1829                    # and is responsible for sending the response back to the client.
1830                    # This buffer will be received by the client as a single chunk of
1831                    # data, which can then be split into individual tokens!
1832                    await send(
1833                        {
1834                            "type": "http.response.body",
1835                            "body": joined_buffer.encode("utf-8"),
1836                            "more_body": True,
1837                        }
1838                    )
1839                    buffer = []  # Clear the buffer for next batch of tokens
1840
1841        # Create a streaming response object using the `_forward` function
1842        # It is useful to wrap your _forward function in a partial function
1843        # to pass in the text argument lazily.
1844        token_streamer = partial(_forward, synapse.messages[0])
1845        # Return the streaming response object, which is an instance of the
1846        # `BTStreamingResponse` class.
1847        return synapse.create_streaming_response(token_streamer)
1848```
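
The buffering loop above can be isolated and tested without an ASGI server. The sketch below (with a hypothetical `batch_tokens` helper) shows the same batch-of-N strategy, including the final partial batch that the full miner implementation flushes with `more_body=False`:

```python
from typing import Iterable, Iterator


def batch_tokens(tokens: Iterable[str], n: int = 3) -> Iterator[str]:
    """Group tokens into joined batches of n, as the miner's send loop does."""
    buffer = []
    for token in tokens:
        buffer.append(token)
        if len(buffer) == n:
            # In the miner, this joined string becomes the body of one
            # "http.response.body" message with more_body=True.
            yield "".join(buffer)
            buffer = []
    if buffer:
        # The leftover partial batch corresponds to the final send with
        # more_body=False.
        yield "".join(buffer)


if __name__ == "__main__":
    print(list(batch_tokens(["a", "b", "c", "d", "e"], 2)))  # ['ab', 'cd', 'e']
```

Note that the condensed example above omits this final flush; the complete miner implementation sends any remaining buffered tokens before closing the stream.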
1849
1850#### Complete Example
1851Here is a full example for reference:
1852> This inherits from the prompting (text-prompting) miner base class.
1853> Take a look at the `prompting/baseminer/miner.py` file [here](https://github.com/opentensor/text-prompting/blob/main/prompting/baseminer/miner.py) for more details.
1854
1855```python
1856class StreamingTemplateMiner(prompting.Miner):
1857    def config(self) -> "bt.Config":
1858        """
1859        Returns the configuration object specific to this miner.
1860
1861        Implement and extend this method to provide custom configurations for the miner.
1862        Currently, it sets up a basic configuration parser.
1863
1864        Returns:
1865            bt.Config: A configuration object with the miner's operational parameters.
1866        """
1867        parser = argparse.ArgumentParser(description="Streaming Miner Configs")
1868        self.add_args(parser)
1869        return bt.config(parser)
1870
1871    def add_args(self, parser: argparse.ArgumentParser):
1872        """
1873        Adds custom arguments to the command line parser.
1874
1875        Developers can introduce additional command-line arguments specific to the miner's
1876        functionality in this method. These arguments can then be used to configure the miner's operation.
1877
1878        Args:
1879            parser (argparse.ArgumentParser):
1880                The command line argument parser to which custom arguments should be added.
1881        """
1882        pass
1883
1884    def prompt(self, synapse: StreamPrompting) -> StreamPrompting:
1885        """
1886        Generates a streaming response for the provided synapse.
1887
1888        This function serves as the main entry point for handling streaming prompts. It takes
1889        the incoming synapse which contains messages to be processed and returns a streaming
1890        response. The function uses the GPT-2 tokenizer and a simulated model to tokenize and decode
1891        the incoming message, and then sends the response back to the client token by token.
1892
1893        Args:
1894            synapse (StreamPrompting): The incoming StreamPrompting instance containing the messages to be processed.
1895
1896        Returns:
1897            StreamPrompting: The streaming response object which can be used by other functions to
1898                            stream back the response to the client.
1899
1900        Usage:
1901            This function can be extended and customized based on specific requirements of the
1902            miner. Developers can swap out the tokenizer, model, or adjust how streaming responses
1903            are generated to suit their specific applications.
1904        """
1905        bt.logging.trace("In outer PROMPT()")
1906        tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
1907
1908        # Simulated function to decode token IDs into strings. In a real-world scenario,
1909        # this can be replaced with an actual model inference step.
1910        def model(ids):
1911            return (tokenizer.decode(id) for id in ids)
1912
1913        async def _prompt(text: str, send: Send):
1914            """
1915            Asynchronously processes the input text and sends back tokens as a streaming response.
1916
1917            This function takes an input text, tokenizes it using the GPT-2 tokenizer, and then
1918            uses the simulated model to decode token IDs into strings. It then sends each token
1919            back to the client as a streaming response, with a delay between tokens to simulate
1920            the effect of real-time streaming.
1921
1922            Args:
1923                text (str): The input text message to be processed.
1924                send (Send): An asynchronous function that allows sending back the streaming response.
1925
1926            Usage:
1927                This function can be adjusted based on the streaming requirements, speed of
1928                response, or the model being used. Developers can also introduce more sophisticated
1929                processing steps or modify how tokens are sent back to the client.
1930            """
1931            bt.logging.trace("In inner _PROMPT()")
1932            input_ids = tokenizer(text, return_tensors="pt").input_ids.squeeze()
1933            buffer = []
1934            bt.logging.debug(f"Input text: {text}")
1935            bt.logging.debug(f"Input ids: {input_ids}")
1936
1937            N = 3  # Number of tokens to send back to the client at a time
1938            for token in model(input_ids):
1939                bt.logging.trace(f"appending token: {token}")
1940                buffer.append(token)
1941                # If buffer has N tokens, send them back to the client.
1942                if len(buffer) == N:
1943                    await asyncio.sleep(0.1)  # pause between batches without blocking the event loop
1944                    joined_buffer = "".join(buffer)
1945                    bt.logging.debug(f"sending tokens: {joined_buffer}")
1946                    await send(
1947                        {
1948                            "type": "http.response.body",
1949                            "body": joined_buffer.encode("utf-8"),
1950                            "more_body": True,
1951                        }
1952                    )
1953                    bt.logging.debug(f"Streamed tokens: {joined_buffer}")
1954                    buffer = []  # Clear the buffer for next batch of tokens
1955
1956            # Send any remaining tokens in the buffer
1957            if buffer:
1958                joined_buffer = "".join(buffer)
1959                await send(
1960                    {
1961                        "type": "http.response.body",
1962                        "body": joined_buffer.encode("utf-8"),
1963                        "more_body": False,  # No more tokens to send
1964                    }
1965                )
1966                bt.logging.trace(f"Streamed tokens: {joined_buffer}")
1967
1968        message = synapse.messages[0]
1969        bt.logging.trace(f"message in _prompt: {message}")
1970        token_streamer = partial(_prompt, message)
1971        bt.logging.trace(f"token streamer: {token_streamer}")
1972        return synapse.create_streaming_response(token_streamer)
1973```
1974
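The `partial` wrapper used above can be seen in isolation: `functools.partial` binds the message text now, so the streamer only needs the ASGI `send` callable later. A stand-alone sketch, using a plain function and a list in place of the real coroutine and `send`:

```python
from functools import partial

def _prompt(text, send):
    # Stand-in for the real coroutine: "stream" the text in 3-character batches.
    for i in range(0, len(text), 3):
        send(text[i:i + 3])

chunks = []
token_streamer = partial(_prompt, "hello world")  # bind `text` now
token_streamer(chunks.append)                     # supply `send` later
print(chunks)  # ['hel', 'lo ', 'wor', 'ld']
```

This is why the miner can hand `token_streamer` to `create_streaming_response`: the framework supplies `send` when the response is actually streamed.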
1975### Writing the client
1976Excellent! Now that we have defined our server, we can define our client.
1977
1978This assumes you have:
19791. Registered your miner on the chain (`finney`/`test`)
19802. Started serving your miner on an open port (e.g. `12345`)
1981
1982Steps:
1983- Instantiate your synapse subclass with the relevant information. E.g. `messages`, `roles`, etc.
1984- Instantiate your wallet and a dendrite client
1985- Query the dendrite client with your synapse object
1986- Iterate over the async generator to extract the yielded tokens on the server side
1987
1988```python
1989
1990import asyncio  # used by asyncio.run(main()) below
1991import bittensor as bt
1992
1993# Create your streaming synapse subclass object to house the request body
1994syn = MyStreamingSynapse(
1995    roles=["user"],
1996    messages=["hello this is a test of a streaming response. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."]
1997)
1998
1999# Create a wallet instance that must be registered on the network
2000wallet = bt.wallet(name="default", hotkey="default")
2001
2002# Instantiate the metagraph
2003metagraph = bt.metagraph(
2004    netuid=8, network="test", sync=True, lite=False
2005)
2006
2007# Grab the axon you're serving
2008my_uid = 1
2009axon = metagraph.axons[my_uid]
2010
2011# Create a Dendrite instance to handle client-side communication.
2012dendrite = bt.dendrite(wallet=wallet)
2013
2014
2015# This is an async function, so we can use the `await` keyword when querying the server with the dendrite object.
2016async def main():
2017    # Send a request to the Axon using the Dendrite, passing in a StreamPrompting 
2018    # instance with roles and messages. The response is awaited, as the Dendrite 
2019    # communicates asynchronously with the Axon. Returns a list of async generators.
2020    responses = await dendrite(
2021        [axon],
2022        syn,
2023        deserialize=False,
2024        streaming=True
2025    )
2026
2027    # Now that we have our responses, iterate over the async generator
2028    # to extract the tokens yielded on the server side
2029    for resp in responses:
2030        i = 0
2031        async for chunk in resp:
2032            i += 1
2033            if i % 5 == 0:
2034                print()
2035            if isinstance(chunk, list):
2036                print(chunk[0], end="", flush=True)
2037            else:
2038                # last object yielded is the synapse itself with completion filled
2039                synapse = chunk
2040        break
2041
2042    # The synapse object contains the completion attribute which contains the
2043    # accumulated tokens from the streaming response.
2044
2045if __name__ == "__main__":
2046    # Run the main function with asyncio
2047    asyncio.run(main())
2048
2049```
2050There you have it!
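The client loop above can be distilled into a small helper. In this sketch, `fake_response` stands in for one element of the `responses` list: it yields token batches as single-element lists, then a final object in place of the filled synapse (a plain dict here, purely for illustration):

```python
import asyncio

async def fake_response():
    # Simulated server stream: token batches first, final object last.
    for batch in (["Hello "], ["world"]):
        yield batch
    yield {"completion": "Hello world"}

async def consume(resp):
    """Accumulate streamed tokens and capture the final synapse-like object."""
    completion, synapse = "", None
    async for chunk in resp:
        if isinstance(chunk, list):
            completion += chunk[0]
        else:
            synapse = chunk
    return completion, synapse

completion, synapse = asyncio.run(consume(fake_response()))
print(completion)  # Hello world
```

The same pattern applies to the real dendrite response: string chunks build the completion, and the last yielded object is the synapse with its completion attribute filled.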
2051
2052### Complete example
2053If you would like to see a complete standalone example that only depends on bittensor>=6.2.0, look below:
2054
2055- client.py
2056- streaming_miner.py
2058
2059# client.py
2060```python
2061# Import bittensor and the text-prompting packages
2062import bittensor as bt
2063import prompting
2064
2065# Create a StreamPrompting synapse object to house the request body
2066syn = prompting.protocol.StreamPrompting(
2067    roles=["user"], 
2068    messages=["hello this is a test of a streaming response. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."])
2070
2071# create a wallet instance that must be registered on the network
2072wallet = bt.wallet(name="default", hotkey="default")
2074
2075# instantiate the metagraph
2076metagraph = bt.metagraph(
2077    netuid=8, network="test", sync=True, lite=False
2078)
2080
2081# Grab the axon you're serving
2082axon = metagraph.axons[62]
2084
2085# Create a Dendrite instance to handle client-side communication.
2086d = bt.dendrite(wallet=wallet)
2088
2089
2090async def main():
2091
2092    # Send a request to the Axon using the Dendrite, passing in a StreamPrompting 
2093    # instance with roles and messages. The response is awaited, as the Dendrite 
2094    # communicates asynchronously with the Axon. Returns a list of async generators.
2095    responses = await d(
2096        [axon],
2097        syn,
2098        deserialize=False,
2099        streaming=True
2100    )
2102
2103    # iterate over the async generator to extract the yielded tokens on server side
2104    for resp in responses:
2105        i = 0
2106        async for chunk in resp:
2107            i += 1
2108            if i % 5 == 0:
2109                print()
2110            if isinstance(chunk, list):
2111                print(chunk[0], end="", flush=True)
2112            else:
2113                # last object yielded is the synapse itself with completion filled
2114                synapse = chunk
2115        break
2116
2117if __name__ == "__main__":
2118    import asyncio
2119    asyncio.run(main())
2120```
2121
2122
2123
2124---
2125File: /docs/running_on_mainnet.md
2126---
2127
2128# Running Subnet on Mainnet
2129
2130This tutorial shows how to use the bittensor `btcli` to create a subnetwork and connect your incentive mechanism to it. 
2131
2132**IMPORTANT:** Before attempting to register on mainnet, we strongly recommend that you:
2133- First run [Running Subnet Locally](running_on_staging.md), and
2134- Then run [Running on the Testnet](running_on_testnet.md).
2135
2136Your incentive mechanisms running on the mainnet are open to anyone. They emit real TAO. Creating these mechanisms incurs a `lock_cost` in TAO.
2137
2138**DANGER**
2139- Do not expose your private keys.
2140- Do not reuse your testnet wallet on the mainnet.
2141- Do not reuse the password of your testnet wallet for your mainnet wallet.
2142- Make sure your incentive mechanism is resistant to abuse. 
2143
2144## Prerequisites
2145
2146Before proceeding further, make sure that you have installed Bittensor. See the below instructions:
2147
2148- [Install `bittensor`](https://github.com/opentensor/bittensor#install).
2149
2150After installing `bittensor`, proceed as below:
2151
2152## Steps
2153
2154## 1. Install your subnet template
2155
2156**NOTE: Skip this step if** you already did this during local testing and development.
2157
2158In your project directory:
2159
2160```bash
2161git clone https://github.com/opentensor/bittensor-subnet-template.git 
2162```
2163
2164Next, `cd` into `bittensor-subnet-template` repo directory:
2165
2166```bash
2167cd bittensor-subnet-template
2168```
2169
2170Install the Bittensor subnet template package:
2171
2172```bash
2173python -m pip install -e . # Install your subnet template package
2174```
2175
2176## 2. Create wallets 
2177
2178Create wallets for subnet owner, subnet validator and for subnet miner.
2179  
2180This step creates local coldkey and hotkey pairs for your three identities: subnet owner, subnet validator and subnet miner. 
2181
2182The owner will create and control the subnet. The owner must have at least 100 TAO before running the next steps.
2183
2184The validator and miner will be registered to the subnet created by the owner. This ensures that the validator and miner can run the respective validator and miner scripts.
2185
2186**NOTE**: You can also use existing wallets to register. Creating new keys is shown here for reference.
2187
2188Create a coldkey for the owner wallet:
2189
2190```bash
2191btcli wallet new_coldkey --wallet.name owner
2192```
2193
2194Create a coldkey and hotkey for the subnet miner wallet:
2195```bash
2196btcli wallet new_coldkey --wallet.name miner
2197```
2198
2199and
2200
2201```bash
2202btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default
2203```
2204
2205Create a coldkey and hotkey for the subnet validator wallet:
2206
2207```bash
2208btcli wallet new_coldkey --wallet.name validator
2209```
2210
2211and
2212
2213```bash
2214btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default
2215```
2216
2217## 3. Getting the price of subnet creation
2218
2219Creating subnets on mainnet is competitive. The cost is determined by the rate at which new subnets are being registered onto the Bittensor blockchain. 
2220
2221By default you must have at least 100 TAO on your owner wallet to create a subnet. However, the exact amount will fluctuate based on demand. The below code shows how to get the current price of creating a subnet.
2222
2223```bash
2224btcli subnet lock_cost 
2225```
2226
2227The above command will show:
2228
2229```bash
2230>> Subnet lock cost: τ100.000000000
2231```
2232
2233## 4. Purchasing a slot
2234
2235Using your TAO balance, you can register your subnet to the mainchain. This will create a new subnet on the mainchain and give you owner permissions on it. The below command shows how to purchase a slot.
2236
2237**NOTE**: Slots cost TAO to lock. You will get this TAO back when the subnet is deregistered.
2238
2239```bash
2240btcli subnet create  
2241```
2242
2243Enter the owner wallet name. This gives permissions to the coldkey.
2244
2245```bash
2246>> Enter wallet name (default): owner # Enter your owner wallet name
2247>> Enter password to unlock key: # Enter your wallet password.
2248>> Register subnet? [y/n]: <y/n> # Select yes (y)
2249>> ⠇ 📡 Registering subnet...
2250✅ Registered subnetwork with netuid: 1 # Your subnet netuid will show here, save this for later.
2251```
2252
2253## 5. (Optional) Register keys 
2254
2255**NOTE**: While this is not enforced, we recommend that subnet owners run a subnet validator and a subnet miner on the subnet to demonstrate proper use to the community.
2256
2257This step registers your subnet validator and subnet miner keys to the subnet, giving them the **first two slots** on the subnet.
2258
2259Register your miner key to the subnet:
2260
2261```bash
2262btcli subnet recycle_register --netuid 1 --subtensor.network finney --wallet.name miner --wallet.hotkey default
2263```
2264
2265Follow the below prompts:
2266
2267```bash
2268>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
2269>> Continue Registration?
2270  hotkey:     ...
2271  coldkey:    ...
2272  network:    finney [y/n]: # Select yes (y)
2273>> ✅ Registered
2274```
2275
2276Next, register your validator key to the subnet:
2277
2278```bash
2279btcli subnet recycle_register --netuid 1 --subtensor.network finney --wallet.name validator --wallet.hotkey default
2280```
2281
2282Follow the below prompts:
2283
2284```bash
2285>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
2286>> Continue Registration?
2287  hotkey:     ...
2288  coldkey:    ...
2289  network:    finney [y/n]: # Select yes (y)
2290>> ✅ Registered
2291```
2292
2293## 6. Check that your keys have been registered
2294
2295Check that your subnet validator key has been registered:
2296
2297```bash
2298btcli wallet overview --wallet.name validator 
2299```
2300
2301The output will be similar to the below:
2302
2303```bash
2304Subnet: 1                                                                                                                                                                
2305COLDKEY  HOTKEY   UID  ACTIVE  STAKE(τ)     RANK    TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)   VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58                    
2306miner    default  0      True   0.00000  0.00000  0.00000    0.00000    0.00000    0.00000            0  0.00000                14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
23071        1        2            τ0.00000  0.00000  0.00000    0.00000    0.00000    0.00000           ρ0  0.00000                                                         
2308                                                                          Wallet balance: τ0.0         
2309```
2310
2311Check that your subnet miner has been registered:
2312
2313```bash
2314btcli wallet overview --wallet.name miner 
2315```
2316
2317The output will be similar to the below:
2318
2319```bash
2320Subnet: 1                                                                                                                                                                
2321COLDKEY  HOTKEY   UID  ACTIVE  STAKE(τ)     RANK    TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)   VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58                    
2322miner    default  1      True   0.00000  0.00000  0.00000    0.00000    0.00000    0.00000            0  0.00000                14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
23231        1        2            τ0.00000  0.00000  0.00000    0.00000    0.00000    0.00000           ρ0  0.00000                                                         
2324                                                                          Wallet balance: τ0.0   
2325```
2326
2327## 7. Run subnet miner and subnet validator
2328
2329Run the subnet miner:
2330
2331```bash
2332python neurons/miner.py --netuid 1  --wallet.name miner --wallet.hotkey default --logging.debug
2333```
2334
2335You will see the below terminal output:
2336
2337```bash
2338>> 2023-08-08 16:58:11.223 |       INFO       | Running miner for subnet: 1 on network: wss://entrypoint-finney.opentensor.ai:443 with config: ...
2339```
2340
2341Run the subnet validator:
2342
2343```bash
2344python neurons/validator.py --netuid 1  --wallet.name validator --wallet.hotkey default --logging.debug
2345```
2346
2347You will see the below terminal output:
2348
2349```bash
2350>> 2023-08-08 16:58:11.223 |       INFO       | Running validator for subnet: 1 on network: wss://entrypoint-finney.opentensor.ai:443 with config: ...
2351```
2352
2353## 8. Get emissions flowing
2354
2355Register to the root subnet using the `btcli`:
2356
2357```bash
2358btcli root register 
2359```
2360
2361Then set your weights for the subnet:
2362
2363```bash
2364btcli root weights 
2365```
2366
2367## 9. Stopping your nodes
2368
2369To stop your nodes, press CTRL + C in the terminal where the nodes are running.
2370
2371---
2372
2373
2374---
2375File: /docs/running_on_staging.md
2376---
2377
2378# Running Subnet Locally
2379
2380This tutorial will guide you through:
2381
2382- Setting up a local blockchain that is not connected to either Bittensor testchain or mainchain
2383- Creating a subnet
2384- Running your incentive mechanism on the subnet.
2385
2386## Local blockchain vs local subtensor node 
2387
2388Running a local blockchain is sometimes referred to as running on staging. This is **different** from running a local subtensor node that connects to the Bittensor mainchain.
2389
2390A local subtensor node will connect to the mainchain and sync with the mainchain, giving you your own access point to the mainchain. 
2391
2392Running a local blockchain spins up two authority nodes locally, not connected to any other nodes, the testchain, or the mainchain. This tutorial covers running a local blockchain.
2393
2394## Prerequisites
2395
2396Before proceeding further, make sure that you have installed Bittensor. See the below instructions:
2397
2398- [Install `bittensor`](https://github.com/opentensor/bittensor#install).
2399
2400After installing `bittensor`, proceed as below:
2401
2402## 1. Install Substrate dependencies
2403
2404Begin by installing the required dependencies for running a Substrate node.
2405
2406Update your system packages:
2407
2408```bash
2409sudo apt update 
2410```
2411
2412Install the additional required libraries and tools:
2413
2414```bash
2415sudo apt install --assume-yes make build-essential git clang curl libssl-dev llvm libudev-dev protobuf-compiler
2416```
2417
2418## 2. Install Rust and Cargo
2419
2420Rust is the programming language used in Substrate development, and Cargo is Rust's package manager.
2421
2422Install Rust and Cargo:
2423
2424```bash
2425curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
2426```
2427
2428Update your shell's source to include Cargo's path:
2429
2430```bash
2431source "$HOME/.cargo/env"
2432```
2433
2434## 3. Clone the subtensor repository
2435
2436This step fetches the subtensor codebase to your local machine.
2437
2438```bash
2439git clone https://github.com/opentensor/subtensor.git
2440```
2441
2442## 4. Setup Rust
2443
2444This step ensures that you have the nightly toolchain and the WebAssembly (wasm) compilation target. Note that a later step runs the subtensor chain directly in your terminal, so we advise running it as a background process using PM2 or similar software.
2445
2446Update to the nightly version of Rust:
2447
2448```bash
2449./subtensor/scripts/init.sh
2450```
2451
2452## 5. Initialize 
2453
2454These steps initialize your local subtensor chain in development mode, setting up and running a local subtensor.
2455
2456Build the binary with the faucet feature enabled:
2457
2458```bash
2459cargo build --release --features pow-faucet
2460```
2461
2462**NOTE**: The `--features pow-faucet` option in the above is required if we want to use the command `btcli wallet faucet` [See the below Mint tokens step](#8-mint-tokens-from-faucet).
2463
2464Next, run the localnet script and turn off the attempt to build the binary (as we have already done this above):
2465
2466```bash
2467BUILD_BINARY=0 ./scripts/localnet.sh 
2468```
2469
2470**NOTE**: Watch for any build or initialization outputs in this step. If you are building the project for the first time, this step will take a while to finish building, depending on your hardware.
2471
2472## 6. Install subnet template
2473
2474`cd` to your project directory and clone the bittensor subnet template repository:
2475
2476```bash
2477git clone https://github.com/opentensor/bittensor-subnet-template.git
2478```
2479
2480Navigate to the cloned repository:
2481
2482```bash
2483cd bittensor-subnet-template
2484```
2485
2486Install the bittensor-subnet-template Python package:
2487
2488```bash
2489python -m pip install -e .
2490```
2491
2492## 7. Set up wallets
2493
2494You will need wallets for the different roles, i.e., subnet owner, subnet validator and subnet miner, in the subnet. 
2495
2496- The owner wallet creates and controls the subnet. 
2497- The validator and miner will be registered to the subnet created by the owner. This ensures that the validator and miner can run the respective validator and miner scripts.
2498
2499Create a coldkey for the owner role:
2500
2501```bash
2502btcli wallet new_coldkey --wallet.name owner
2503```
2504
2505Set up the miner's wallets:
2506
2507```bash
2508btcli wallet new_coldkey --wallet.name miner
2509```
2510
2511```bash
2512btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default
2513```
2514
2515Set up the validator's wallets:
2516
2517```bash
2518btcli wallet new_coldkey --wallet.name validator
2519```
2520```bash
2521btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default
2522```
2523
2524## 8. Mint tokens from faucet
2525
2526You will need tokens to initialize the incentive mechanism on the chain, as well as for registering the subnet.
2527
2528Run the following commands to mint faucet tokens for the owner and for the validator.
2529
2530Mint faucet tokens for the owner:
2531
2532```bash
2533btcli wallet faucet --wallet.name owner --subtensor.chain_endpoint ws://127.0.0.1:9946 
2534```
2535
2536You will see:
2537
2538```bash
2539>> Balance: τ0.000000000 ➡ τ100.000000000
2540```
2541
2542Mint tokens for the validator:
2543
2544```bash
2545btcli wallet faucet --wallet.name validator --subtensor.chain_endpoint ws://127.0.0.1:9946 
2546```
2547
2548You will see:
2549
2550```bash
2551>> Balance: τ0.000000000 ➡ τ100.000000000
2552```
2553
2554## 9. Create a subnet
2555
2556The below commands establish a new subnet on the local chain. The cost will be exactly τ1000.000000000 for the first subnet you create, and you will have to run the faucet several times to accumulate enough tokens.
2557
2558```bash
2559btcli subnet create --wallet.name owner --subtensor.chain_endpoint ws://127.0.0.1:9946 
2560```
2561
2562You will see:
2563
2564```bash
2565>> Your balance is: τ200.000000000
2566>> Do you want to register a subnet for τ1000.000000000? [y/n]: 
2567>> Enter password to unlock key: [YOUR_PASSWORD]
2568>> ✅ Registered subnetwork with netuid: 1
2569```
2570
2571**NOTE**: The local chain will now have a default `netuid` of 1. The second registration will create a `netuid` 2 and so on, until you reach the subnet limit of 8. If you register more than 8 subnets, then a subnet with the least staked TAO will be replaced by the 9th subnet you register.
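Since the local faucet mints τ100 per call (see step 8) and the first subnet locks τ1000, a quick back-of-the-envelope check of how many faucet runs "several times" means:

```python
import math

lock_cost = 1000.0      # τ locked by the first local subnet registration
faucet_per_run = 100.0  # τ minted per `btcli wallet faucet` call

# Number of faucet runs needed to cover the lock cost.
print(math.ceil(lock_cost / faucet_per_run))  # 10
```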
2572
2573## 10. Register keys
2574
2575Register your subnet validator and subnet miner on the subnet. This gives your two keys unique slots on the subnet. The subnet has a current limit of 128 slots.
2576
2577Register the subnet miner:
2578
2579```bash
2580btcli subnet register --wallet.name miner --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
2581```
2582
2583Follow the below prompts:
2584
2585```bash
2586>> Enter netuid [1] (1): 1
2587>> Continue Registration? [y/n]: y
2588>> ✅ Registered
2589```
2590
2591Register the subnet validator:
2592
2593```bash
2594
2595btcli subnet register --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
2596```
2597
2598Follow the below prompts:
2599
2600```
2601>> Enter netuid [1] (1): 1
2602>> Continue Registration? [y/n]: y
2603>> ✅ Registered
2604```
2605
2606## 11. Add stake 
2607
2608This step bootstraps the incentives on your new subnet by adding stake into its incentive mechanism.
2609
2610```bash
2611btcli stake add --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
2612```
2613
2614Follow the below prompts:
2615
2616```bash
2617>> Stake all Tao from account: 'validator'? [y/n]: y
2618>> Stake:
2619    τ0.000000000 ➡ τ100.000000000
2620```
2621
2622## 12. Validate key registrations
2623
2624Verify that both the miner and validator keys are successfully registered:
2625
2626```bash
2627btcli subnet list --subtensor.chain_endpoint ws://127.0.0.1:9946
2628```
2629
2630You will see a `2` under the `NEURONS` column for `NETUID` 1, indicating that you have registered a validator and a miner in this subnet:
2631
2632```bash
2633NETUID  NEURONS  MAX_N   DIFFICULTY  TEMPO  CON_REQ  EMISSION  BURN(τ)  
2634   1        2     256.00   10.00 M    1000    None     0.00%    τ1.00000 
2636```
2637
2638See the subnet validator's registered details:
2639
2640```bash
2641btcli wallet overview --wallet.name validator --subtensor.chain_endpoint ws://127.0.0.1:9946
2642```
2643
2644You will see:
2645
2646```
2647Subnet: 1                                                                                                                                                                
2648COLDKEY  HOTKEY   UID  ACTIVE  STAKE(τ)     RANK    TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)   VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58                    
2649miner    default  0      True   100.00000  0.00000  0.00000    0.00000    0.00000    0.00000            0  0.00000                14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
26501        1        2            τ100.00000  0.00000  0.00000    0.00000    0.00000    0.00000           ρ0  0.00000                                                         
2651                                                                          Wallet balance: τ0.0         
2652```
2653
2654See the subnet miner's registered details:
2655
2656```bash
2657btcli wallet overview --wallet.name miner --subtensor.chain_endpoint ws://127.0.0.1:9946
2658```
2659
2660You will see:
2661
2662```bash
2663Subnet: 1                                                                                                                                                                
2664COLDKEY  HOTKEY   UID  ACTIVE  STAKE(τ)     RANK    TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)   VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58                    
2665miner    default  1      True   0.00000  0.00000  0.00000    0.00000    0.00000    0.00000            0  0.00000                14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
26661        1        2            τ0.00000  0.00000  0.00000    0.00000    0.00000    0.00000           ρ0  0.00000                                                         
2667                                                                          Wallet balance: τ0.0   
2668
2669```
2670
2671## 13. Run subnet miner and subnet validator
2672
2673Run the subnet miner and subnet validator. Make sure to specify your subnet parameters.
2674
2675Run the subnet miner:
2676
2677```bash
2678python neurons/miner.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name miner --wallet.hotkey default --logging.debug
2679```
2680
2681Run the subnet validator:
2682
2683```bash
2684python neurons/validator.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name validator --wallet.hotkey default --logging.debug
2685```
2686
2687## 14. Set weights for your subnet
2688
2689Register a validator on the root subnet and boost to set weights for your subnet. This is a necessary step to ensure that the subnet is able to receive emissions.
2690
2691### Register your validator on the root subnet
2692
2693```bash
2694btcli root register --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
2695```
2696
2697### Boost your subnet on the root subnet
2698```bash
2699btcli root boost --netuid 1 --increase 1 --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946
2700```
2701
## 15. Verify your incentive mechanism

After a few blocks, the subnet validator will set weights, indicating that the incentive mechanism is active. After a subnet tempo elapses (360 blocks, or 72 minutes), you will see your incentive mechanism begin to distribute TAO to the subnet miner.

```bash
btcli wallet overview --wallet.name miner --subtensor.chain_endpoint ws://127.0.0.1:9946
```
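The 72-minute figure above is just tempo length times block time. A minimal sketch of the arithmetic, assuming the default 12-second block interval (adjust `BLOCK_TIME_SECONDS` if your local chain is configured differently):

```python
# Estimate how long a subnet tempo takes in wall-clock time.
# BLOCK_TIME_SECONDS = 12 is an assumption (the usual subtensor default).
BLOCK_TIME_SECONDS = 12
TEMPO_BLOCKS = 360

def tempo_minutes(tempo_blocks: int, block_time_seconds: int = BLOCK_TIME_SECONDS) -> float:
    """Convert a tempo length in blocks to minutes."""
    return tempo_blocks * block_time_seconds / 60

print(tempo_minutes(TEMPO_BLOCKS))  # 72.0
```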

## Ending your session

To halt your nodes:
```bash
# Press CTRL + C keys in the terminal.
```

---



---
File: /docs/running_on_testnet.md
---

# Running Subnet on Testnet

This tutorial shows how to use the Bittensor testnet to create a subnet and run your incentive mechanism on it.

**IMPORTANT:** We strongly recommend that you first run [Running Subnet Locally](running_on_staging.md) before running on the testnet. Incentive mechanisms running on the testnet are open to anyone, and although these mechanisms do not emit real TAO, they cost you test TAO, which you must obtain first.

**DANGER**
- Do not expose your private keys.
- Only use your testnet wallet.
- Do not reuse the password of your mainnet wallet.
- Make sure your incentive mechanism is resistant to abuse.

## Prerequisites

Before proceeding further, make sure that you have installed Bittensor. See the instructions below:

- [Install `bittensor`](https://github.com/opentensor/bittensor#install).

After installing `bittensor`, proceed as below:

## 1. Install Bittensor subnet template

**NOTE: Skip this step if** you already did this during local testing and development.

`cd` into your project directory and clone the bittensor-subnet-template repo:

```bash
git clone https://github.com/opentensor/bittensor-subnet-template.git
```

Next, `cd` into the bittensor-subnet-template repo directory:

```bash
cd bittensor-subnet-template # Enter the cloned repo directory
```

Install the bittensor-subnet-template package:

```bash
python -m pip install -e .
```

## 2. Create wallets

Create wallets for the subnet owner, the subnet validator, and the subnet miner.

This step creates local coldkey and hotkey pairs for your three identities: subnet owner, subnet validator, and subnet miner.

The owner will create and control the subnet. The owner must have at least 100 testnet TAO before running the next steps.

The validator and miner will be registered to the subnet created by the owner. This ensures that the validator and miner can run the respective validator and miner scripts.

Create a coldkey for your owner wallet:

```bash
btcli wallet new_coldkey --wallet.name owner
```

Create a coldkey and hotkey for your miner wallet:

```bash
btcli wallet new_coldkey --wallet.name miner
```

and

```bash
btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default
```

Create a coldkey and hotkey for your validator wallet:

```bash
btcli wallet new_coldkey --wallet.name validator
```

and

```bash
btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default
```

## 3. Get the price of subnet creation

Creating subnets on the testnet is competitive. The cost is determined by the rate at which new subnets are being registered onto the chain.

By default you must have at least 100 testnet TAO in your owner wallet to create a subnet. However, the exact amount will fluctuate based on demand. The below command shows how to get the current price of creating a subnet.

```bash
btcli subnet lock_cost --subtensor.network test
```

The above command will show:

```bash
>> Subnet lock cost: τ100.000000000
```
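If you are scripting around `btcli`, the lock-cost line can be extracted programmatically. A minimal sketch, assuming the output format shown above (the `parse_lock_cost` helper is hypothetical, not part of the template):

```python
import re

def parse_lock_cost(line: str) -> float:
    """Extract the TAO lock cost from a `btcli subnet lock_cost` output line.

    Assumes the `>> Subnet lock cost: τ<amount>` format shown above.
    """
    match = re.search(r"Subnet lock cost:\s*τ([\d.]+)", line)
    if match is None:
        raise ValueError(f"Unrecognized lock cost line: {line!r}")
    return float(match.group(1))

print(parse_lock_cost(">> Subnet lock cost: τ100.000000000"))  # 100.0
```

This is handy, for example, for a script that waits until the lock cost drops below your budget before calling `btcli subnet create`.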

## 4. (Optional) Get faucet tokens

The faucet is disabled on the testnet. Hence, if you don't have sufficient test TAO, ask the [Bittensor Discord community](https://discord.com/channels/799672011265015819/830068283314929684) for faucet tokens.

## 5. Purchase a slot

Using the test TAO from the previous step you can register your subnet on the testnet. This will create a new subnet on the testnet and give you owner permissions to it.

The below command shows how to purchase a slot.

**NOTE**: Slots cost TAO to lock. You will get this TAO back when the subnet is deregistered.

```bash
btcli subnet create --subtensor.network test
```

Enter the owner wallet name, which gives permissions to the coldkey:

```bash
>> Enter wallet name (default): owner # Enter your owner wallet name
>> Enter password to unlock key: # Enter your wallet password.
>> Register subnet? [y/n]: <y/n> # Select yes (y)
>> ⠇ 📡 Registering subnet...
✅ Registered subnetwork with netuid: 1 # Your subnet netuid will show here, save this for later.
```

## 6. Register keys

This step registers your subnet validator and subnet miner keys to the subnet, giving them the **first two slots** on the subnet.

Register your miner key to the subnet:

```bash
btcli subnet recycle_register --netuid 1 --subtensor.network test --wallet.name miner --wallet.hotkey default
```

Follow the prompts below:

```bash
>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
>> Continue Registration?
  hotkey:     ...
  coldkey:    ...
  network:    finney [y/n]: # Select yes (y)
>> ✅ Registered
```

Next, register your validator key to the subnet:

```bash
btcli subnet recycle_register --netuid 1 --subtensor.network test --wallet.name validator --wallet.hotkey default
```

Follow the prompts:

```bash
>> Enter netuid [1] (1): # Enter netuid 1 to specify the subnet you just created.
>> Continue Registration?
  hotkey:     ...
  coldkey:    ...
  network:    finney [y/n]: # Select yes (y)
>> ✅ Registered
```

## 7. Check that your keys have been registered

This step returns information about your registered keys.

Check that your validator key has been registered:

```bash
btcli wallet overview --wallet.name validator --subtensor.network test
```

The above command will display the below:

```bash
Subnet: 1
COLDKEY    HOTKEY   UID  ACTIVE  STAKE(τ)     RANK    TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)   VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58
validator  default  0      True   0.00000  0.00000  0.00000    0.00000    0.00000    0.00000            0  0.00000                14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
1          1        2            τ0.00000  0.00000  0.00000    0.00000    0.00000    0.00000           ρ0  0.00000
                                                                          Wallet balance: τ0.0
```

Check that your miner has been registered:

```bash
btcli wallet overview --wallet.name miner --subtensor.network test
```

The above command will display the below:

```bash
Subnet: 1
COLDKEY  HOTKEY   UID  ACTIVE  STAKE(τ)     RANK    TRUST  CONSENSUS  INCENTIVE  DIVIDENDS  EMISSION(ρ)   VTRUST  VPERMIT  UPDATED  AXON  HOTKEY_SS58
miner    default  1      True   0.00000  0.00000  0.00000    0.00000    0.00000    0.00000            0  0.00000                14  none  5GTFrsEQfvTsh3WjiEVFeKzFTc2xcf…
1        1        2            τ0.00000  0.00000  0.00000    0.00000    0.00000    0.00000           ρ0  0.00000
                                                                          Wallet balance: τ0.0
```

## 8. Run subnet miner and subnet validator

Run the subnet miner:

```bash
python neurons/miner.py --netuid 1 --subtensor.network test --wallet.name miner --wallet.hotkey default --logging.debug
```

You will see the below terminal output:

```bash
>> 2023-08-08 16:58:11.223 |       INFO       | Running miner for subnet: 1 on network: ws://127.0.0.1:9946 with config: ...
```

Next, run the subnet validator:

```bash
python neurons/validator.py --netuid 1 --subtensor.network test --wallet.name validator --wallet.hotkey default --logging.debug
```

You will see the below terminal output:

```bash
>> 2023-08-08 16:58:11.223 |       INFO       | Running validator for subnet: 1 on network: ws://127.0.0.1:9946 with config: ...
```


## 9. Get emissions flowing

Register to the root network using the `btcli`:

```bash
btcli root register --subtensor.network test
```

Then set your weights for the subnet:

```bash
btcli root weights --subtensor.network test
```

## 10. Stopping your nodes

To stop your nodes, press CTRL + C in the terminal where the nodes are running.


---
File: /docs/what_are_subnets.md
---

# What is Bittensor?
Bittensor is a network where computers validate the work that other computers contribute to the network. The work that is most valuable to the collective is rewarded.

Bittensor is a catalyst for open-source developers and smaller AI research labs, who now have a financial incentive for fine-tuning open foundational models.

Bittensor is a library of machine intelligence that continuously grows and shares knowledge amongst peers.

# What is a subnet?

Bittensor is releasing its own language for creating incentive mechanisms. This allows developers to build incentive systems on Bittensor, tapping into our web of intelligence to develop markets of the developer's choosing.

Subnet 1, an incentive system for machine intelligence production, showcases the enormous potential of markets to procure huge amounts of resources. Releasing user-created subnets is set to create a Cambrian explosion of additional resources in the Bittensor ecosystem.

# Why should you care?

As an open-source developer, you now have the ability to write your own incentive mechanisms without creating an entirely new chain. By tapping into Bittensor's network of intelligence, you can incentivize AI models from all over the world to perform tasks of your choosing (e.g., image generation, storage, compute access) - the possibilities are truly endless.

The release of subnets also offers the potential to pull these tools into a shared network, making all the ingredients necessary to create intelligence available within one network, governed by one token.

You get to play a vital role in helping bootstrap what could one day become one of the most powerful networks in the world - and you make money by doing so!

By incentivizing developers to create their own markets, Bittensor is set to become a one-stop shop for those seeking all the compute requirements for building unstoppable applications on top of an incentivized infrastructure.

# Deeper dive
Check out the Bittensor about page [here](https://bittensor.com/about) for more details about what the Bittensor paradigm is and why subnets are revolutionary technology.

Also see our [linktree](https://linktr.ee/opentensor) for more information.

---
File: /neurons/__init__.py
---




---
File: /neurons/miner.py
---

# The MIT License (MIT)
# Copyright © 2023 Yuma Rao
# Copyright © 2023 Omega Labs, Inc.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software.

# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import os
# Set USE_TORCH=1 environment variable to use torch instead of numpy
os.environ["USE_TORCH"] = "1"

import time
import json
import typing
import requests
import asyncio
import bittensor as bt

# Bittensor Miner Template:
import omega

from omega.base.miner import BaseMinerNeuron
from omega.imagebind_wrapper import ImageBind
from omega.miner_utils import search_and_diarize_youtube_videos, search_and_embed_youtube_videos
from omega.augment import LocalLLMAugment, OpenAIAugment, NoAugment
from omega.utils.config import QueryAugment
from omega.constants import VALIDATOR_TIMEOUT, VALIDATOR_TIMEOUT_AUDIO
from omega.diarization_pipeline import CustomDiarizationPipeline

class Miner(BaseMinerNeuron):
    """
    Your miner neuron class. You should use this class to define your miner's behavior. In particular, you should replace the forward function with your own logic. You may also want to override the blacklist and priority functions according to your needs.
    """
    def __init__(self, config=None):
        super(Miner, self).__init__(config=config)
        query_augment_type = QueryAugment(self.config.neuron.query_augment)
        if query_augment_type == QueryAugment.NoAugment:
            self.augment = NoAugment(device=self.config.neuron.device)
        elif query_augment_type == QueryAugment.LocalLLMAugment:
            self.augment = LocalLLMAugment(device=self.config.neuron.device)
        elif query_augment_type == QueryAugment.OpenAIAugment:
            self.augment = OpenAIAugment(device=self.config.neuron.device)
        else:
            raise ValueError("Invalid query augment")

        self.diarization_pipeline = CustomDiarizationPipeline(
            overlap_detection_model_id="tezuesh/overlapped-speech-detection",
            diarization_model_id="tezuesh/diarization",
            # device="cuda"
        )
        self.imagebind = ImageBind(v2=True)

    async def forward_videos(
        self, synapse: omega.protocol.Videos
    ) -> omega.protocol.Videos:
        # Scrape YouTube videos
        bt.logging.info(f"Received scraping request: {synapse.num_videos} videos for query '{synapse.query}'")

        start = time.time()

        synapse.video_metadata = search_and_embed_youtube_videos(
            self.augment(synapse.query), synapse.num_videos, self.imagebind
        )

        time_elapsed = time.time() - start

        if len(synapse.video_metadata) == synapse.num_videos and time_elapsed < VALIDATOR_TIMEOUT:
            bt.logging.info(f"–––––– SCRAPING SUCCEEDED: Scraped {len(synapse.video_metadata)}/{synapse.num_videos} videos in {time_elapsed} seconds.")
        else:
            bt.logging.error(f"–––––– SCRAPING FAILED: Scraped {len(synapse.video_metadata)}/{synapse.num_videos} videos in {time_elapsed} seconds.")

        return synapse

    async def forward_audios(
        self, synapse: omega.protocol.Audios
    ) -> omega.protocol.Audios:
        bt.logging.info(f"Received youtube audio scraping and diarization request: {synapse.num_audios} audios for query '{synapse.query}'")

        start = time.time()

        synapse.audio_metadata = search_and_diarize_youtube_videos(
            self.augment(synapse.query), synapse.num_audios, self.diarization_pipeline, self.imagebind
        )

        time_elapsed = time.time() - start

        if len(synapse.audio_metadata) == synapse.num_audios and time_elapsed < VALIDATOR_TIMEOUT_AUDIO:
            bt.logging.info(f"–––––– SCRAPING SUCCEEDED: Scraped {len(synapse.audio_metadata)}/{synapse.num_audios} audios in {time_elapsed} seconds.")
        else:
            bt.logging.error(f"–––––– SCRAPING FAILED: Scraped {len(synapse.audio_metadata)}/{synapse.num_audios} audios in {time_elapsed} seconds.")
        return synapse

    async def blacklist(
        self, synapse: bt.Synapse
    ) -> typing.Tuple[bool, str]:
        """
        Determines whether an incoming request should be blacklisted and thus ignored. Your implementation should
        define the logic for blacklisting requests based on your needs and desired security parameters.

        Blacklist runs before the synapse data has been deserialized (i.e. before synapse.data is available).
        The synapse is instead constructed via the headers of the request. It is important to blacklist
        requests before they are deserialized to avoid wasting resources on requests that will be ignored.

        Args:
            synapse (template.protocol.Videos): A synapse object constructed from the headers of the incoming request.

        Returns:
            Tuple[bool, str]: A tuple containing a boolean indicating whether the synapse's hotkey is blacklisted,
                            and a string providing the reason for the decision.

        This function is a security measure to prevent resource wastage on undesired requests. It should be enhanced
        to include checks against the metagraph for entity registration, validator status, and sufficient stake
        before deserialization of synapse data to minimize processing overhead.

        Example blacklist logic:
        - Reject if the hotkey is not a registered entity within the metagraph.
        - Consider blacklisting entities that are not validators or have insufficient stake.

        In practice it would be wise to blacklist requests from entities that are not validators, or do not have
        enough stake. This can be checked via metagraph.S and metagraph.validator_permit. You can always attain
        the uid of the sender via a metagraph.hotkeys.index( synapse.dendrite.hotkey ) call.

        Otherwise, allow the request to be processed further.
        """
        if not synapse.dendrite.hotkey:
            return True, "Hotkey not provided"
        registered = synapse.dendrite.hotkey in self.metagraph.hotkeys
        if self.config.blacklist.allow_non_registered and not registered:
            return False, "Allowing un-registered hotkey"
        elif not registered:
            bt.logging.trace(
                f"Blacklisting un-registered hotkey {synapse.dendrite.hotkey}"
            )
            return True, f"Unrecognized hotkey {synapse.dendrite.hotkey}"

        uid = self.metagraph.hotkeys.index(synapse.dendrite.hotkey)
        if self.config.blacklist.force_validator_permit:
            # If the config is set to force validator permit, then we should only allow requests from validators.
            if not self.metagraph.validator_permit[uid]:
                bt.logging.warning(
                    f"Blacklisting a request from non-validator hotkey {synapse.dendrite.hotkey}"
                )
                return True, "Non-validator hotkey"

        stake = self.metagraph.S[uid].item()
        if self.config.blacklist.validator_min_stake and stake < self.config.blacklist.validator_min_stake:
            bt.logging.warning(f"Blacklisting request from {synapse.dendrite.hotkey} [uid={uid}], not enough stake -- {stake}")
            return True, "Stake below minimum"

        bt.logging.trace(
            f"Not Blacklisting recognized hotkey {synapse.dendrite.hotkey}"
        )
        return False, "Hotkey recognized!"

    async def blacklist_videos(
        self, synapse: omega.protocol.Videos
    ) -> typing.Tuple[bool, str]:
        return await self.blacklist(synapse)

    async def blacklist_audios(
        self, synapse: omega.protocol.Audios
    ) -> typing.Tuple[bool, str]:
        return await self.blacklist(synapse)

    async def priority(self, synapse: bt.Synapse) -> float:
        """
        The priority function determines the order in which requests are handled. More valuable or higher-priority
        requests are processed before others. You should design your own priority mechanism with care.

        This implementation assigns priority to incoming requests based on the calling entity's stake in the metagraph.

        Args:
            synapse (template.protocol.Audios): The synapse object that contains metadata about the incoming request.

        Returns:
            float: A priority score derived from the stake of the calling entity.

        Miners may receive messages from multiple entities at once. This function determines which request should be
        processed first. Higher values indicate that the request should be processed first. Lower values indicate
        that the request should be processed later.

        Example priority logic:
        - A higher stake results in a higher priority value.
        """
        caller_uid = self.metagraph.hotkeys.index(
            synapse.dendrite.hotkey
        )  # Get the caller index.
        priority = float(
            self.metagraph.S[caller_uid]
        )  # Return the stake as the priority.
        bt.logging.trace(
            f"Prioritizing {synapse.dendrite.hotkey} with value: ", priority
        )
        return priority

    async def priority_videos(
        self, synapse: omega.protocol.Videos
    ) -> float:
        return await self.priority(synapse)

    async def priority_audios(
        self, synapse: omega.protocol.Audios
    ) -> float:
        return await self.priority(synapse)

    def save_state(self):
        """
        We define this function to avoid printing out the log message in the BaseNeuron class
        that says `save_state() not implemented`.
        """
        pass

# This is the main function, which runs the miner.
if __name__ == "__main__":
    with Miner() as miner:
        while True:
            bt.logging.info("Miner running...", time.time())
            time.sleep(5)


---
File: /neurons/test_miner.py
---

from omega.miner_utils import search_and_embed_youtube_videos, ImageBind
from omega.constants import VALIDATOR_TIMEOUT
from omega.protocol import Videos
import time
import requests

imagebind = ImageBind(v2=True)
start = time.time()
query = "wine and winemaking"
num_videos = 8
video_metadata_list = search_and_embed_youtube_videos(query, num_videos, imagebind)
time_elapsed = time.time() - start

if time_elapsed > VALIDATOR_TIMEOUT or len(video_metadata_list) < num_videos:
    if time_elapsed > VALIDATOR_TIMEOUT:
        print(f"Searching took {time_elapsed} seconds, which is longer than the validator timeout of {VALIDATOR_TIMEOUT} seconds")

    if len(video_metadata_list) < num_videos:
        print(f"Only got {len(video_metadata_list)} videos, which is less than the requested {num_videos} videos")
else:
    print(f"SUCCESS! Search and embed took {time_elapsed} seconds and got {len(video_metadata_list)} videos")


if len(video_metadata_list) == 0:
    print("No videos found")
else:
    videos = Videos(query=query, num_videos=num_videos, video_metadata=video_metadata_list)
    response = requests.get(
        "https://dev-validator.api.omega-labs.ai/api/count_unique",
        json=videos.to_serializable_dict(videos)
    )
    print(response.json())


---
File: /neurons/validator.py
---

# The MIT License (MIT)
# Copyright © 2023 Omega Labs, Inc.

# Copyright © 2023 Yuma Rao
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software.

# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import os
# Set USE_TORCH=1 environment variable to use torch instead of numpy
os.environ["USE_TORCH"] = "1"

from aiohttp import ClientSession, BasicAuth
import asyncio
from typing import List, Tuple, Optional, BinaryIO, Dict
import datetime as dt
import random
import traceback
import requests
import math
import soundfile as sf
from io import BytesIO
import json
import numpy as np
# Bittensor
import bittensor as bt
import torch
import torch.nn.functional as F
from torch.nn import CosineSimilarity
import wandb
import base64
# Bittensor Validator Template:
from omega.utils.uids import get_random_uids
from omega.protocol import Videos, VideoMetadata, AudioMetadata, Audios
from omega.constants import (
    VALIDATOR_TIMEOUT,
    VALIDATOR_TIMEOUT_MARGIN,
    VALIDATOR_TIMEOUT_AUDIO,
    MAX_VIDEO_LENGTH,
    MIN_VIDEO_LENGTH,
    CHECK_PROBABILITY,
    DIFFERENCE_THRESHOLD,
    SIMILARITY_THRESHOLD,
    VIDEO_DOWNLOAD_TIMEOUT,
    MIN_SCORE,
    FAKE_VIDEO_PUNISHMENT,
    QUERY_RELEVANCE_SCALING_FACTOR,
    DESCRIPTION_RELEVANCE_SCALING_FACTOR,
    VIDEO_RELEVANCE_WEIGHT,
    FOCUS_REWARDS_PERCENT,
    AUDIO_REWARDS_PERCENT,
    DESCRIPTION_LENGTH_WEIGHT,
    MIN_LENGTH_BOOST_TOKEN_COUNT,
    MAX_LENGTH_BOOST_TOKEN_COUNT,
    STUFFED_DESCRIPTION_PUNISHMENT,
    FOCUS_MIN_SCORE,
    MIN_AUDIO_LENGTH_SECONDS,
    MAX_AUDIO_LENGTH_SECONDS,
    MIN_AUDIO_LENGTH_SCORE,
    SPEECH_CONTENT_SCALING_FACTOR,
    SPEAKER_DOMINANCE_SCALING_FACTOR,
    BACKGROUND_NOISE_SCALING_FACTOR,
    UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR,
    AUDIO_LENGTH_SCALING_FACTOR,
    AUDIO_QUALITY_SCALING_FACTOR,
    DIARIZATION_SCALING_FACTOR,
    AUDIO_QUERY_RELEVANCE_SCALING_FACTOR
)
from omega import video_utils, unstuff
from omega.imagebind_wrapper import ImageBind, Embeddings, run_async, LENGTH_TOKENIZER
from omega.text_similarity import get_text_similarity_score
from omega.diarization_metric import calculate_diarization_metrics
from omega.audio_scoring import AudioScore

# import base validator class which takes care of most of the boilerplate
from omega.base.validator import BaseValidatorNeuron

NO_RESPONSE_MINIMUM = 0.005
GPU_SEMAPHORE = asyncio.Semaphore(1)
DOWNLOAD_SEMAPHORE = asyncio.Semaphore(5)

3382class Validator(BaseValidatorNeuron):
3383    """
3384    Your validator neuron class. You should use this class to define your validator's behavior. In particular, you should replace the forward function with your own logic.
3385
3386    This class inherits from the BaseValidatorNeuron class, which in turn inherits from BaseNeuron. The BaseNeuron class takes care of routine tasks such as setting up wallet, subtensor, metagraph, logging directory, parsing config, etc. You can override any of the methods in BaseNeuron if you need to customize the behavior.
3387
3388    This class provides reasonable default behavior for a validator such as keeping a moving average of the scores of the miners and using them to set weights at the end of each epoch. Additionally, the scores are reset for new hotkeys at the end of each epoch.
3389    """
3390
3391    def __init__(self, config=None):
3392        super(Validator, self).__init__(config=config)
3393        self.audio_score = AudioScore()
3394        bt.logging.info("load_state()")
3395        self.load_state()
3396        self.successfully_started_wandb = False
3397
3398        if not self.config.wandb.off:
3399            if os.getenv("WANDB_API_KEY"):
3400                self.new_wandb_run()
3401                self.successfully_started_wandb = True
3402            else:
3403                bt.logging.exception("WANDB_API_KEY not found. Set it with `export WANDB_API_KEY=<your API key>`. Alternatively, you can disable W&B with --wandb.off, but it is strongly recommended to run with W&B enabled.")
3404                self.successfully_started_wandb = False
3405        else:
3406            bt.logging.warning("Running with --wandb.off. It is strongly recommended to run with W&B enabled.")
3407            self.successfully_started_wandb = False
3408        
3409        api_root = (
3410            "https://dev-validator.api.omega-labs.ai"
3411            if self.config.subtensor.network == "test" else
3412            "https://validator.api.omega-labs.ai"
3413        )
3414        self.validation_endpoint = f"{api_root}/api/validate"
3415        self.proxy_endpoint = f"{api_root}/api/get_proxy"
3416        self.novelty_scores_endpoint = f"{api_root}/api/get_pinecone_novelty"
3417        self.upload_video_metadata_endpoint = f"{api_root}/api/upload_video_metadata"
3418        self.upload_audio_metadata_endpoint = f"{api_root}/api/upload_audio_metadata"
3419        self.focus_rewards_percent_endpoint = f"{api_root}/api/focus/get_rewards_percent"
3420        self.focus_miner_purchases_endpoint = f"{api_root}/api/focus/miner_purchase_scores"
3421        self.num_videos = 8
3422        self.num_audios = 4
3423        self.client_timeout_seconds = VALIDATOR_TIMEOUT + VALIDATOR_TIMEOUT_MARGIN
3424        self.client_timeout_seconds_audio = VALIDATOR_TIMEOUT_AUDIO + VALIDATOR_TIMEOUT_MARGIN
3425        # load topics from topics URL (CSV) or fallback to local topics file
3426        self.load_topics_start = dt.datetime.now()
3427        self.all_topics = self.load_topics()
3428
3429        self.imagebind = None
3430        
3431        self.load_focus_rewards_start = dt.datetime.now()
3432        self.FOCUS_REWARDS_PERCENT = self.load_focus_rewards_percent() # 2.5%
3433        self.AUDIO_REWARDS_PERCENT = AUDIO_REWARDS_PERCENT # 12.5%
3434        self.YOUTUBE_REWARDS_PERCENT = 1.0 - self.FOCUS_REWARDS_PERCENT - self.AUDIO_REWARDS_PERCENT # 85%
3435
3436        if not self.config.neuron.decentralization.off:
3437            if torch.cuda.is_available():
3438                bt.logging.info("Running with decentralization enabled, thank you Bittensor Validator!")
3439                self.decentralization = True
3440                self.imagebind = ImageBind(v2=True)
3441            else:
3442                bt.logging.warning("Attempting to run decentralization, but no GPU found. Please see min_compute.yml for minimum resource requirements.")
3443                self.decentralization = False
3444        else:
3445            bt.logging.warning("Running with --decentralization.off. It is strongly recommended to run with decentralization enabled.")
3446            self.decentralization = False
3447    
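A quick sanity check of the emission split computed above. The values here are illustrative stand-ins: the real `FOCUS_REWARDS_PERCENT` is fetched from the validator API at startup and `AUDIO_REWARDS_PERCENT` is a shared subnet constant.

```python
# Illustrative stand-ins for the runtime values noted in the comments above.
FOCUS_REWARDS_PERCENT = 0.025    # fetched from the validator API (2.5%)
AUDIO_REWARDS_PERCENT = 0.125    # shared subnet constant (12.5%)
YOUTUBE_REWARDS_PERCENT = 1.0 - FOCUS_REWARDS_PERCENT - AUDIO_REWARDS_PERCENT

# The three pools must partition the full emission.
assert abs(FOCUS_REWARDS_PERCENT + AUDIO_REWARDS_PERCENT + YOUTUBE_REWARDS_PERCENT - 1.0) < 1e-9
assert 0.0 < YOUTUBE_REWARDS_PERCENT < 1.0
```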
3448
3449    def new_wandb_run(self):
3450        # Shoutout SN13 for the wandb snippet!
3451        """Creates a new wandb run to save information to."""
3452        # Create a unique run id for this run.
3453        now = dt.datetime.now()
3454        self.wandb_run_start = now
3455        run_id = now.strftime("%Y-%m-%d_%H-%M-%S")
3456        name = "validator-" + str(self.uid) + "-" + run_id
3457        self.wandb_run = wandb.init(
3458            name=name,
3459            project="omega-sn24-validator-logs",
3460            entity="omega-labs",
3461            config={
3462                "uid": self.uid,
3463                "hotkey": self.wallet.hotkey.ss58_address,
3464                "run_name": run_id,
3465                "type": "validator",
3466            },
3467            allow_val_change=True,
3468            anonymous="allow",
3469        )
3470
3471        bt.logging.debug(f"Started a new wandb run: {name}")
3472
3473    def load_topics(self):
3474        # get topics from CSV URL and load them into our topics list
3475        try:
3476            response = requests.get(self.config.topics_url, timeout=30)  # avoid hanging on a stalled request
3477            response.raise_for_status()
3478            # split the response text into a list of topics, trimming whitespace and dropping empty lines
3479            all_topics = [line.strip() for line in response.text.split("\n") if line.strip()]
3480            bt.logging.info(f"Loaded {len(all_topics)} topics from {self.config.topics_url}")
3481        except Exception as e:
3482            bt.logging.error(f"Error loading topics from URL {self.config.topics_url}: {e}")
3483            traceback.print_exc()
3484            bt.logging.info(f"Using fallback topics from {self.config.topics_path}")
3485            all_topics = [line.strip() for line in open(self.config.topics_path) if line.strip()]
3486            bt.logging.info(f"Loaded {len(all_topics)} topics from {self.config.topics_path}")
3487        return all_topics
3488    
3489    def load_focus_rewards_percent(self):
3490        # get focus rewards percent from API endpoint or fallback to default
3491        try:
3492            response = requests.get(self.focus_rewards_percent_endpoint, timeout=30)  # avoid hanging on a stalled request
3493            response.raise_for_status()
3494            rewards_percent = float(response.text)
3495            bt.logging.info(f"Loaded focus rewards percent of {rewards_percent} from {self.focus_rewards_percent_endpoint}")
3496        except Exception as e:
3497            bt.logging.error(f"Error loading focus rewards percent from {self.focus_rewards_percent_endpoint}: {e}")
3498            traceback.print_exc()
3499            bt.logging.info(f"Using fallback focus rewards percent of {FOCUS_REWARDS_PERCENT}")
3500            rewards_percent = FOCUS_REWARDS_PERCENT
3501        return rewards_percent
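Both load_topics() and load_focus_rewards_percent() follow the same fetch-with-fallback pattern. A minimal sketch of that pattern; `fetch_with_fallback` and its arguments are hypothetical stand-ins, not subnet APIs:

```python
def fetch_with_fallback(fetch_fn, parse_fn, default):
    """Try the remote source first; on any failure, return the local default."""
    try:
        return parse_fn(fetch_fn())
    except Exception:
        return default

# A failing fetch falls back to the default value...
assert fetch_with_fallback(lambda: 1 / 0, float, 0.025) == 0.025
# ...while a successful fetch parses the remote response text.
assert fetch_with_fallback(lambda: "0.05", float, 0.025) == 0.05
```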
3502
3503    async def forward(self):
3504        """
3505        Validator forward pass, called by the validator every time step.
3506        Responsible for querying the network and scoring the responses:
3507        - Generating the query
3508        - Querying the miners
3509        - Getting the responses
3510        - Rewarding the miners
3511        - Updating the scores
3512        """
3521        miner_uids = get_random_uids(self, k=self.config.neuron.sample_size)
3522        # miner_uids = torch.LongTensor([0])
3523
3524        if len(miner_uids) == 0:
3525            bt.logging.info("No miners available")
3526            return
3527        
3528        """ START YOUTUBE AUDIO PROCESSING AND SCORING """
3529        bt.logging.info("===== YOUTUBE REQUESTS, AUDIO PROCESSING, AND SCORING =====")
3530        # The dendrite client queries the network.
3531        query = random.choice(self.all_topics) + " podcast"
3532        bt.logging.info(f"Sending query '{query}' to miners {miner_uids}")
3533        audio_input_synapse = Audios(query=query, num_audios=self.num_audios)
3534        bt.logging.info(f"audio_input_synapse: {audio_input_synapse}")
3536        axons = [self.metagraph.axons[uid] for uid in miner_uids]
3537        audio_responses = await self.dendrite(
3538            # Send the query to selected miner axons in the network.
3539            axons=axons,
3540            synapse=audio_input_synapse,
3541            deserialize=False,
3542            timeout=self.client_timeout_seconds_audio,
3543        )
3544        audio_working_miner_uids = []
3545        audio_finished_responses = []
3546
3547        for response in audio_responses:
3548            if response.audio_metadata is None or not response.axon or not response.axon.hotkey:
3549                continue
3550
3551            uid = [uid for uid, axon in zip(miner_uids, axons) if axon.hotkey == response.axon.hotkey][0]
3552            audio_working_miner_uids.append(uid)
3553            audio_finished_responses.append(response)
3554
3555        if len(audio_working_miner_uids) == 0:
3556            bt.logging.info("No miner responses available for audio synapse")
3557        
3558        # Log the results for monitoring purposes.
3559        bt.logging.info(f"Received audio responses: {audio_responses}")
3560        # Adjust the scores based on responses from miners.
3561        try:
3562            audio_rewards_list = await self.handle_checks_and_reward_audio(input_synapse=audio_input_synapse, responses=audio_finished_responses)
3563        except Exception as e:
3564            bt.logging.error(f"Error in handle_checks_and_reward_audio: {e}")
3565            traceback.print_exc()
3566            return
3567        
3568        audio_rewards = []
3569        audio_reward_uids = []
3570        for r, r_uid in zip(audio_rewards_list, audio_working_miner_uids):
3571            if r is not None:
3572                audio_rewards.append(r)
3573                audio_reward_uids.append(r_uid)
3574        audio_rewards = torch.FloatTensor(audio_rewards).to(self.device)
3575        self.update_audio_scores(audio_rewards, audio_reward_uids)
3576        
3577        # give min reward to miners who didn't respond
3578        bad_miner_uids = [uid for uid in miner_uids if uid not in audio_working_miner_uids]
3579        penalty_tensor = torch.FloatTensor([NO_RESPONSE_MINIMUM] * len(bad_miner_uids)).to(self.device)
3580        self.update_audio_scores(penalty_tensor, bad_miner_uids)
3581
3582        for reward, miner_uid in zip(audio_rewards, audio_reward_uids):
3583            bt.logging.info(f"Rewarding miner={miner_uid} with reward={reward} for audio dataset")
3584        
3585        for penalty, miner_uid in zip(penalty_tensor, bad_miner_uids):
3586            bt.logging.info(f"Penalizing miner={miner_uid} with penalty={penalty}")
3587
3588        """ END YOUTUBE AUDIO PROCESSING AND SCORING """
3589
3590        """ START YOUTUBE SYNAPSE REQUESTS, PROCESSING, AND SCORING """
3591        bt.logging.info("===== YOUTUBE REQUESTS, PROCESSING, AND SCORING =====") 
3592        # The dendrite client queries the network.
3593        query = random.choice(self.all_topics)
3594        bt.logging.info(f"Sending query '{query}' to miners {miner_uids}")
3595        input_synapse = Videos(query=query, num_videos=self.num_videos)
3596        axons = [self.metagraph.axons[uid] for uid in miner_uids]
3597        responses = await self.dendrite(
3598            # Send the query to selected miner axons in the network.
3599            axons=axons,
3600            synapse=input_synapse,
3601            deserialize=False,
3602            timeout=self.client_timeout_seconds,
3603        )
3604        
3605        working_miner_uids = []
3606        finished_responses = []
3607
3608        for response in responses:
3609            if response.video_metadata is None or not response.axon or not response.axon.hotkey:
3610                continue
3611
3612            uid = [uid for uid, axon in zip(miner_uids, axons) if axon.hotkey == response.axon.hotkey][0]
3613            working_miner_uids.append(uid)
3614            finished_responses.append(response)
3615
3616        if len(working_miner_uids) == 0:
3617            bt.logging.info("No miner responses available")
3618        
3619        # Log the results for monitoring purposes.
3620        bt.logging.info(f"Received video responses: {responses}")
3621
3622        # Adjust the scores based on responses from miners.
3623        try:
3624            # Check if this validator is running decentralization
3625            if not self.decentralization:
3626                # if not, use validator API get_rewards system
3627                rewards_list = await self.get_rewards(input_synapse=input_synapse, responses=finished_responses)
3628            else:
3629                # if so, use decentralization logic with local GPU
3630                rewards_list = await self.handle_checks_and_rewards_youtube(input_synapse=input_synapse, responses=finished_responses)
3631        except Exception as e:
3632            bt.logging.error(f"Error in handle_checks_and_rewards_youtube: {e}")
3633            traceback.print_exc()
3634            return
3635
3636        # give reward to all miners who responded and had a non-null reward
3637        rewards = []
3638        reward_uids = []
3639        for r, r_uid in zip(rewards_list, working_miner_uids):
3640            if r is not None:
3641                rewards.append(r)
3642                reward_uids.append(r_uid)
3643        rewards = torch.FloatTensor(rewards).to(self.device)
3644        self.update_scores(rewards, reward_uids)
3645        
3646        # give min reward to miners who didn't respond
3647        bad_miner_uids = [uid for uid in miner_uids if uid not in working_miner_uids]
3648        penalty_tensor = torch.FloatTensor([NO_RESPONSE_MINIMUM] * len(bad_miner_uids)).to(self.device)
3649        self.update_scores(penalty_tensor, bad_miner_uids)
3650
3651        for reward, miner_uid in zip(rewards, reward_uids):
3652            bt.logging.info(f"Rewarding miner={miner_uid} with reward={reward}")
3653        
3654        for penalty, miner_uid in zip(penalty_tensor, bad_miner_uids):
3655            bt.logging.info(f"Penalizing miner={miner_uid} with penalty={penalty}")
3656        """ END YOUTUBE SYNAPSE REQUESTS, PROCESSING, AND SCORING """
3657
3658        """ START FOCUS VIDEOS PROCESSING AND SCORING """
3659        bt.logging.info("===== FOCUS VIDEOS PROCESSING AND SCORING =====")
3660        # Gather all focus videos purchased by the subset of miners
3661        focus_miner_uids = []
3662        focus_miner_hotkeys = []
3663        
3664        # Fetch focus videos for the selected miners' hotkeys.
3665        miner_hotkeys = [self.metagraph.hotkeys[miner_uid] for miner_uid in miner_uids]
3666        focus_videos = await self.get_focus_videos(miner_hotkeys, miner_uids)
3667        # Bail out before iterating if the focus-videos API returned nothing.
3668        if not focus_videos:
3669            bt.logging.info("No focus videos found for miners.")
3670            return
3671
3672        # Check responses and mark which miner uids and hotkeys have focus videos
3673        for focus_video in focus_videos:
3674            if focus_video and 'purchased_videos' in focus_video:
3675                focus_miner_uids.append(focus_video['miner_uid'])
3676                focus_miner_hotkeys.append(focus_video['miner_hotkey'])
3677        if len(focus_miner_uids) == 0:
3678            bt.logging.info("No focus videos found for miners.")
3679            return
3680        focus_rewards_list = await self.handle_checks_and_rewards_focus(focus_videos=focus_videos)
3681        # give reward to all miners with focus videos and had a non-null reward
3682        focus_rewards = []
3683        focus_reward_uids = []
3684        for r, r_uid in zip(focus_rewards_list, focus_miner_uids):
3685            if r is not None:
3686                focus_rewards.append(r)
3687                focus_reward_uids.append(r_uid)
3688        focus_rewards = torch.FloatTensor(focus_rewards).to(self.device)
3689        self.update_focus_scores(focus_rewards, focus_reward_uids)
3690
3691        # set focus score to 0 for miners who don't have any focus videos
3692        no_focus_videos_miner_uids = [uid for uid in miner_uids if uid not in focus_reward_uids]
3693        no_rewards_tensor = torch.FloatTensor([FOCUS_MIN_SCORE] * len(no_focus_videos_miner_uids)).to(self.device)
3694        self.update_focus_scores(no_rewards_tensor, no_focus_videos_miner_uids)
3695
3696        for reward, miner_uid in zip(focus_rewards, focus_reward_uids):
3697            bt.logging.info(f"Rewarding miner={miner_uid} with reward={reward} for focus videos")
3698
3699        for no_reward, miner_uid in zip(no_rewards_tensor, no_focus_videos_miner_uids):
3700            bt.logging.info(f"Scoring miner={miner_uid} with reward={no_reward} for no focus videos")
3701        """ END FOCUS VIDEOS PROCESSING AND SCORING """
3702
3703
3704        
3705
3706    def metadata_check(self, metadata: List[VideoMetadata]) -> List[VideoMetadata]:
3707        return [
3708            video_metadata for video_metadata in metadata
3709            if (
3710                video_metadata.end_time - video_metadata.start_time <= MAX_VIDEO_LENGTH and
3711                video_metadata.end_time - video_metadata.start_time >= MIN_VIDEO_LENGTH
3712            )
3713        ]
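metadata_check() above keeps only clips whose duration falls inside the allowed window. A standalone sketch of that filter; the bounds here are illustrative, since the real MIN_VIDEO_LENGTH / MAX_VIDEO_LENGTH are subnet constants:

```python
MIN_VIDEO_LENGTH, MAX_VIDEO_LENGTH = 5, 120  # illustrative bounds, in seconds

def duration_ok(start: float, end: float) -> bool:
    """Mirror of the metadata_check duration predicate."""
    return MIN_VIDEO_LENGTH <= end - start <= MAX_VIDEO_LENGTH

clips = [(0, 3), (10, 70), (0, 600)]   # (start_time, end_time) pairs
kept = [clip for clip in clips if duration_ok(*clip)]
assert kept == [(10, 70)]  # only the 60-second clip survives
```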
3714    
3715    def audio_metadata_check(self, metadata: List[AudioMetadata]) -> List[AudioMetadata]:
3716        return [
3717            audio_metadata for audio_metadata in metadata
3718            if (
3719                audio_metadata.end_time - audio_metadata.start_time <= MAX_VIDEO_LENGTH and
3720                audio_metadata.end_time - audio_metadata.start_time >= MIN_VIDEO_LENGTH
3721            )
3722        ]
3723    
3724    
3725    def filter_embeddings(self, embeddings: Embeddings, is_too_similar: List[bool]) -> Embeddings:
3726        """Filter the embeddings based on whether they are too similar to the query."""
3727        is_too_similar = torch.tensor(is_too_similar)
3728        if embeddings.video is not None:
3729            embeddings.video = embeddings.video[~is_too_similar]
3730        if embeddings.audio is not None:
3731            embeddings.audio = embeddings.audio[~is_too_similar]
3732        if embeddings.description is not None:
3733            embeddings.description = embeddings.description[~is_too_similar]
3734        return embeddings
3735
3736    def filter_stuffed_embeddings(self, embeddings: Embeddings, stuffed: List[Tuple[bool, float]]) -> Embeddings:
3737        """Filter the embeddings based on whether they are too similar to the query."""
3738        stuffed = torch.tensor([s for s, _ in stuffed])
3739        if embeddings.video is not None:
3740            embeddings.video = embeddings.video[~stuffed]
3741        if embeddings.audio is not None:
3742            embeddings.audio = embeddings.audio[~stuffed]
3743        if embeddings.description is not None:
3744            embeddings.description = embeddings.description[~stuffed]
3745        return embeddings
3746    
3747    async def deduplicate_videos(self, embeddings: Embeddings) -> List[bool]:
3748        # return a list of booleans where True means the corresponding video is a duplicate i.e. is_similar
3749        video_tensor = embeddings.video
3750        num_videos = video_tensor.shape[0]
3751        cossim = CosineSimilarity(dim=1)
3752        is_similar = []
3753        for i in range(num_videos):
3754            similarity_score = cossim(video_tensor[[i]], video_tensor[i + 1:])
3755            has_duplicates = (similarity_score > SIMILARITY_THRESHOLD).any()
3756            is_similar.append(has_duplicates.item())
3757        
3758        return is_similar
3759
3760    async def deduplicate_audios(self, embeddings: Embeddings) -> List[bool]:
3761        # return a list of booleans where True means the corresponding audio is a duplicate i.e. is_similar
3762        audio_tensor = embeddings.audio
3763        num_audios = audio_tensor.shape[0]
3764        cossim = CosineSimilarity(dim=1)
3765        is_similar = []
3766        for i in range(num_audios):
3767            similarity_score = cossim(audio_tensor[[i]], audio_tensor[i + 1:])
3768            has_duplicates = (similarity_score > SIMILARITY_THRESHOLD).any()
3769            is_similar.append(has_duplicates.item())
3770        
3771        return is_similar
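Both deduplication methods above flag item i as a duplicate when it is too similar to any later item in the batch. A dependency-free sketch of that pairwise check; the threshold value is illustrative, since the real SIMILARITY_THRESHOLD is a subnet constant:

```python
import math

SIMILARITY_THRESHOLD = 0.95  # illustrative value

def cosine(a, b):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mark_duplicates(embeddings):
    """True at index i if item i is too similar to ANY later item,
    mirroring deduplicate_videos / deduplicate_audios above."""
    return [
        any(cosine(embeddings[i], embeddings[j]) > SIMILARITY_THRESHOLD
            for j in range(i + 1, len(embeddings)))
        for i in range(len(embeddings))
    ]

embs = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
# The first vector is near-identical to the second, so only it is flagged;
# the last item can never be flagged (no later items to compare against).
assert mark_duplicates(embs) == [True, False, False]
```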
3772    
3773    def is_similar(self, emb_1: torch.Tensor, emb_2: List[float]) -> bool:
3774        return bool(F.cosine_similarity(
3775            emb_1,
3776            torch.tensor(emb_2, device=emb_1.device).unsqueeze(0)
3777        ) > SIMILARITY_THRESHOLD)
3778
3779    def strict_is_similar(self, emb_1: torch.Tensor, emb_2: List[float]) -> bool:
3780        return torch.allclose(emb_1, torch.tensor(emb_2, device=emb_1.device), atol=1e-4)
3781    
3782    async def get_random_youtube_video(
3783        self,
3784        metadata,
3785        check_video: bool
3786    ):
3787        if not check_video and len(metadata) > 0:
3788            random_metadata = random.choice(metadata)
3789            return random_metadata, None
3790
3791        random_video = None
3792        metadata_copy = [v for v in metadata]  # list shallow copy
3793        while random_video is None and len(metadata_copy) > 0:
3794            idx = random.randint(0, len(metadata_copy) - 1)
3795            random_metadata = metadata_copy.pop(idx)
3796            proxy_url = await self.get_proxy_url()
3797            if proxy_url is None:
3798                bt.logging.info("Issue getting proxy_url from API, not using proxy. Attempting download for random_video check")
3799            else:
3800                bt.logging.info("Got proxy_url from API. Attempting download for random_video check")
3801            try:
3802                async with DOWNLOAD_SEMAPHORE:
3803                    random_video = await asyncio.wait_for(run_async(
3804                        video_utils.download_youtube_video,
3805                        random_metadata.video_id,
3806                        random_metadata.start_time,
3807                        random_metadata.end_time,
3808                        proxy=proxy_url
3809                    ), timeout=VIDEO_DOWNLOAD_TIMEOUT)
3810            except video_utils.IPBlockedException:
3811                # IP is blocked, cannot download video, check description only
3812                bt.logging.warning("WARNING: IP is blocked, cannot download video, checking description only")
3813                return random_metadata, None
3814            except video_utils.FakeVideoException:
3815                bt.logging.warning(f"WARNING: Video {random_metadata.video_id} is fake, punishing miner")
3816                return None
3817            except asyncio.TimeoutError:
3818                continue
3819
3820        # IP is not blocked, video is not fake, but video download failed for some reason. We don't
3821        # know why it failed so we won't punish the miner, but we will check the description only.
3822        if random_video is None:
3823            return random_metadata, None
3824
3825        return random_metadata, random_video
3826
3827    async def random_youtube_check(self, random_meta_and_vid: Tuple[VideoMetadata, Optional[object]]) -> bool:
3828        random_metadata, random_video = random_meta_and_vid
3829
3830        if random_video is None:
3831            desc_embeddings = self.imagebind.embed_text([random_metadata.description])
3832            is_similar_ = self.is_similar(desc_embeddings, random_metadata.description_emb)
3833            strict_is_similar_ = self.strict_is_similar(desc_embeddings, random_metadata.description_emb)
3834            bt.logging.info(f"Description similarity: {is_similar_}, strict description similarity: {strict_is_similar_}")
3835            return is_similar_
3836
3837        # Video downloaded, check all embeddings
3838        embeddings = self.imagebind.embed([random_metadata.description], [random_video])
3839        is_similar_ = (
3840            self.is_similar(embeddings.video, random_metadata.video_emb) and
3841            self.is_similar(embeddings.audio, random_metadata.audio_emb) and
3842            self.is_similar(embeddings.description, random_metadata.description_emb)
3843        )
3844        strict_is_similar_ = (
3845            self.strict_is_similar(embeddings.video, random_metadata.video_emb) and
3846            self.strict_is_similar(embeddings.audio, random_metadata.audio_emb) and
3847            self.strict_is_similar(embeddings.description, random_metadata.description_emb)
3848        )
3849        bt.logging.debug(f"Total similarity: {is_similar_}, strict total similarity: {strict_is_similar_}")
3850        return is_similar_
3851    
3852
3853    async def random_audio_check(self, random_meta_and_audio: Tuple[AudioMetadata, Optional[object]]) -> bool:
3854        random_metadata, random_video = random_meta_and_audio
3855        bt.logging.info(f"inside random_audio_check, random_metadata: {random_metadata}, random_video: {random_video}")
3856        if random_video is None:
3857            return True
3858
3859        # Decode the freshly downloaded audio and the miner-submitted audio,
3860        # then compare the raw sample arrays.
3861        audio_bytes_from_youtube = video_utils.get_audio_bytes(random_video.name)
3862        audio_array_youtube, _ = sf.read(BytesIO(audio_bytes_from_youtube))
3863        submitted_audio_bytes = random_metadata.audio_bytes
3864        audio_array_submitted, _ = sf.read(BytesIO(base64.b64decode(submitted_audio_bytes)))
3865
3866        if not np.array_equal(audio_array_youtube, audio_array_submitted):
3867            bt.logging.warning("WARNING: Audio bytes do not match")
3868            return False
3869        return True
3870        
3871    def compute_novelty_score_among_batch(self, emb: Embeddings) -> List[float]:
3872        video_tensor = emb.video
3873        num_videos = video_tensor.shape[0]
3874        novelty_scores = []
3875        for i in range(num_videos - 1):
3876            similarity_score = F.cosine_similarity(video_tensor[[i]], video_tensor[i + 1:]).max()
3877            novelty_scores.append(1 - similarity_score.item())
3878        novelty_scores.append(1.0)  # last video is 100% novel
3879        return novelty_scores
3880    
3881    def compute_novelty_score_among_batch_audio(self, emb: Embeddings) -> List[float]:
3882        audio_tensor = emb.audio
3883        num_audios = audio_tensor.shape[0]
3884        novelty_scores = []
3885        for i in range(num_audios - 1):
3886            similarity_score = F.cosine_similarity(audio_tensor[[i]], audio_tensor[i + 1:]).max()
3887            novelty_scores.append(1 - similarity_score.item())
3888        novelty_scores.append(1.0)  # last audio is 100% novel
3889        return novelty_scores
3890
3891    async def async_zero(self) -> int:
3892        return 0
3893
3894    # algorithm for computing final novelty score
3895    def compute_final_novelty_score(self, base_novelty_scores: List[float]) -> float:
3896        is_too_similar = [score < DIFFERENCE_THRESHOLD for score in base_novelty_scores]
3897        novelty_score = sum([
3898            score for score, is_too_similar
3899            in zip(base_novelty_scores, is_too_similar) if not is_too_similar
3900        ])
3901        return novelty_score
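compute_final_novelty_score() above sums only the per-item novelty scores that clear the threshold. A standalone sketch; the threshold value is illustrative, since the real DIFFERENCE_THRESHOLD is a subnet constant:

```python
DIFFERENCE_THRESHOLD = 0.1  # illustrative value

def final_novelty_score(base_novelty_scores):
    """Sum the novelty scores of items that are NOT too similar,
    mirroring compute_final_novelty_score above."""
    return sum(score for score in base_novelty_scores if score >= DIFFERENCE_THRESHOLD)

assert final_novelty_score([0.05, 0.25, 0.5]) == 0.75   # 0.05 is filtered out
assert final_novelty_score([0.01, 0.02]) == 0            # everything too similar
```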
3902    
3903    async def check_videos_and_calculate_rewards_youtube(
3904        self,
3905        input_synapse: Videos,
3906        videos: Videos
3907    ) -> Optional[float]:
3908        try:
3909            # return minimum score if no videos were found in video_metadata
3910            if len(videos.video_metadata) == 0:
3911                return MIN_SCORE
3912
3913            # check video_ids for fake videos
3914            if any(not video_utils.is_valid_youtube_id(video.video_id) for video in videos.video_metadata):
3915                return FAKE_VIDEO_PUNISHMENT
3916
3917            # check and filter duplicate metadata
3918            metadata = self.metadata_check(videos.video_metadata)[:input_synapse.num_videos]
3919            if len(metadata) < len(videos.video_metadata):
3920                bt.logging.info(f"Filtered {len(videos.video_metadata)} videos down to {len(metadata)} videos")
3921
3922            # if randomly tripped, flag our random check to pull a video from miner's submissions
3923            check_video = CHECK_PROBABILITY > random.random()
3924            
3925            # pull a random video and/or description only
3926            random_meta_and_vid = await self.get_random_youtube_video(metadata, check_video)
3927            if random_meta_and_vid is None:
3928                return FAKE_VIDEO_PUNISHMENT
3929
3930            # execute the random check on metadata and video
3931            async with GPU_SEMAPHORE:
3932                passed_check = await self.random_youtube_check(random_meta_and_vid)
3933
3934                # punish miner if not passing
3935                if not passed_check:
3936                    return FAKE_VIDEO_PUNISHMENT
3937                query_emb = await self.imagebind.embed_text_async([videos.query])
3938
3939            embeddings = Embeddings(
3940                video=torch.stack([torch.tensor(v.video_emb) for v in metadata]).to(self.imagebind.device),
3941                audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]).to(self.imagebind.device),
3942                description=torch.stack([torch.tensor(v.description_emb) for v in metadata]).to(self.imagebind.device),
3943            )
3944
3945            # check and deduplicate videos based on embedding similarity checks. We do this because we're not uploading to pinecone first.
3946            metadata_is_similar = await self.deduplicate_videos(embeddings)
3947            metadata = [meta for meta, too_similar in zip(metadata, metadata_is_similar) if not too_similar]
3948            embeddings = self.filter_embeddings(embeddings, metadata_is_similar)
3949            if len(metadata) < len(videos.video_metadata):
3950                bt.logging.info(f"Deduplicated {len(videos.video_metadata)} videos down to {len(metadata)} videos")
3951
3952            # return minimum score if no unique videos were found
3953            if len(metadata) == 0:
3954                return MIN_SCORE
3955            
3956            # first get local novelty scores
3957            local_novelty_scores = self.compute_novelty_score_among_batch(embeddings)
3958            #bt.logging.debug(f"local_novelty_scores: {local_novelty_scores}")
3959            # second get the novelty scores from the validator api if not already too similar
3960            embeddings_to_check = [
3961                (embedding, metadata)
3962                for embedding, local_score, metadata in zip(embeddings.video, local_novelty_scores, metadata)
3963                if local_score >= DIFFERENCE_THRESHOLD
3964            ]
3965            # If there are embeddings to check, call get_novelty_scores once
3966            if embeddings_to_check:
3967                embeddings_to_check, metadata_to_check = zip(*embeddings_to_check)
3968                global_novelty_scores = await self.get_novelty_scores(metadata_to_check)
3969            else:
3970                # If no embeddings to check, return an empty list or appropriate default value
3971                global_novelty_scores = []
3972
3973            if global_novelty_scores is None or len(global_novelty_scores) == 0:
3974                bt.logging.error("Issue retrieving global novelty scores, returning None.")
3975                return None
3976            # #bt.logging.debug(f"global_novelty_scores: {global_novelty_scores}")
3977            
3978            # calculate true novelty scores between local and global
3979            true_novelty_scores = [
3980                min(local_score, global_score) for local_score, global_score
3981                in zip(local_novelty_scores, global_novelty_scores)
3982            ]
3983            #bt.logging.debug(f"true_novelty_scores: {true_novelty_scores}")
3984
3985            pre_filter_metadata_length = len(metadata)
3986            # check scores from index for being too similar
3987            is_too_similar = [score < DIFFERENCE_THRESHOLD for score in true_novelty_scores]
3988            # filter out metadata too similar
3989            metadata = [meta for meta, too_similar in zip(metadata, is_too_similar) if not too_similar]
3990            # filter out embeddings too similar
3991            embeddings = self.filter_embeddings(embeddings, is_too_similar)
3992            if len(metadata) < pre_filter_metadata_length:
3993                bt.logging.info(f"Filtering {pre_filter_metadata_length} videos down to {len(metadata)} videos that are too similar to videos in our index.")
3994
3995            # return minimum score if no unique videos were found
3996            if len(metadata) == 0:
3997                return MIN_SCORE
3998
3999            # Filter out "stuffed" descriptions.
4000            pre_filter_metadata_length = len(metadata)
4001            stuffed = [
4002                unstuff.is_stuffed(meta.description)
4003                for meta in metadata
4004            ]
4005            if any([garbage and confidence > 0.75 for garbage, confidence in stuffed]):
4006                bt.logging.warning("Stuffed description found with high confidence, penalizing the miner.")
4007                return STUFFED_DESCRIPTION_PUNISHMENT
4008
4009            # More stuffing.
4010            extraneous = [
4011                unstuff.check_extraneous_chunks(meta.description, meta.video_emb, meta.audio_emb, self.imagebind)
4012                for meta in metadata
4013            ]
4014            for really_bad, low_quality, total in extraneous:
4015                if really_bad > 5 or low_quality >= 16:
4016                    bt.logging.info(f"Extraneous garbage found in text check {really_bad=} {low_quality=} {total=}")
4017                    return STUFFED_DESCRIPTION_PUNISHMENT
4018
4019            keep_mask = [
4020                not stuffed[idx][0]
4021                and extraneous[idx][1] <= 15
4022                and extraneous[idx][2] <= 50
4023                for idx in range(len(metadata))
4024            ]
4025            metadata = [meta for meta, keep in zip(metadata, keep_mask) if keep]
4026            if len(metadata) < pre_filter_metadata_length:
4027                bt.logging.info(f"Filtering {pre_filter_metadata_length} videos down to {len(metadata)} videos to remove token-stuffed descriptions.")
4028            if len(metadata) == 0:
4029                return MIN_SCORE
4030            embeddings = self.filter_embeddings(embeddings, [not keep for keep in keep_mask])
4031
4032            # Compute relevance scores
4033            video_description_relevance_scores = F.cosine_similarity(
4034                embeddings.video, embeddings.description
4035            ).tolist()
4036            audio_description_relevance_scores = F.cosine_similarity(
4037                embeddings.audio, embeddings.description
4038            ).tolist()
4039            video_query_relevance_scores = F.cosine_similarity(
4040                embeddings.video, query_emb
4041            ).tolist()
4042            audio_query_relevance_scores = F.cosine_similarity(
4043                embeddings.audio, query_emb
4044            ).tolist()
4045
4046            # Query relevance score now includes video cosim, audio cosim, and text cosim using higher quality text-only model.
4047            query_relevance_scores = [
4048                sum([
4049                    video_query_relevance_scores[idx],
4050                    audio_query_relevance_scores[idx],
4051                    get_text_similarity_score(metadata[idx].description, videos.query),
4052                ]) / 3
4053                for idx in range(len(video_query_relevance_scores))
4054            ]
4055
4056            # Combine audio & visual description scores, weighted towards visual.
4057            description_relevance_scores = [
4058                sum([
4059                    video_description_relevance_scores[idx] * VIDEO_RELEVANCE_WEIGHT,
4060                    audio_description_relevance_scores[idx] * (1.0 - VIDEO_RELEVANCE_WEIGHT),
4061                ])
4062                for idx in range(len(video_description_relevance_scores))
4063            ]
4064
4065            # Scale description scores by number of unique tokens.
4066            length_scalers = []
4067            for idx in range(len(description_relevance_scores)):
4068                unique_tokens = LENGTH_TOKENIZER(metadata[idx].description)
4069                unique_tokens = set(unique_tokens[unique_tokens != 0][1:-1].tolist())
4070                unique_token_count = len(unique_tokens)
4071                if unique_token_count <= MIN_LENGTH_BOOST_TOKEN_COUNT:
4072                    bt.logging.debug(f"Very few tokens, applying {DESCRIPTION_LENGTH_WEIGHT} penalty.")
4073                    description_relevance_scores[idx] *= (1.0 - DESCRIPTION_LENGTH_WEIGHT)
4074                    length_scalers.append(0)
4075                    continue
4076                length_scaler = min(math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2), math.log(unique_token_count, 2)) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2)
4077                length_scaler /= (math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2))
4078                length_scalers.append(length_scaler)
4079                bt.logging.debug(f"Description length scaling factor = {length_scaler}")
4080                description_relevance_scores[idx] -= description_relevance_scores[idx] * DESCRIPTION_LENGTH_WEIGHT * (1.0 - length_scaler)
4081
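The length-scaling loop above maps a description's unique-token count onto [0, 1] on a log2 scale between the two boost bounds; counts at or below the lower bound get the full length penalty (scaler 0), counts at or above the upper bound get none. A sketch of that mapping in isolation (the bound values are illustrative, not the real `MIN/MAX_LENGTH_BOOST_TOKEN_COUNT`):

```python
import math

# Illustrative bounds; the real MIN/MAX_LENGTH_BOOST_TOKEN_COUNT may differ.
MIN_TOKENS = 32
MAX_TOKENS = 1024

def length_scaler(unique_token_count, lo=MIN_TOKENS, hi=MAX_TOKENS):
    """Map a unique-token count onto [0, 1] on a log2 scale between lo and hi."""
    if unique_token_count <= lo:
        # The validator applies the full DESCRIPTION_LENGTH_WEIGHT penalty here.
        return 0.0
    scaled = min(math.log2(hi), math.log2(unique_token_count)) - math.log2(lo)
    return scaled / (math.log2(hi) - math.log2(lo))

print(length_scaler(32))    # at or below the lower bound -> 0.0
print(length_scaler(1024))  # at or above the upper bound -> 1.0
```

The score adjustment then scales the penalty by `(1.0 - length_scaler)`, so longer, token-diverse descriptions lose less of their relevance score.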
4082            # Aggregate scores
4083            score = (
4084                (sum(description_relevance_scores) * DESCRIPTION_RELEVANCE_SCALING_FACTOR) +
4085                (sum(query_relevance_scores) * QUERY_RELEVANCE_SCALING_FACTOR)
4086            ) / 2 / videos.num_videos
4087            score = max(score, MIN_SCORE)
4088
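The aggregation step can be sketched as a pure function: the two scaled score sums are averaged, normalized by the requested video count, and floored at the minimum score. The constant values below are illustrative placeholders, not the validator's actual scaling factors:

```python
# Illustrative constants; the validator's actual values may differ.
DESCRIPTION_RELEVANCE_SCALING_FACTOR = 1.0
QUERY_RELEVANCE_SCALING_FACTOR = 1.0
MIN_SCORE = 0.005

def aggregate_score(description_scores, query_scores, num_videos):
    """Average the two scaled score sums over the requested video count,
    flooring the result at MIN_SCORE."""
    score = (
        sum(description_scores) * DESCRIPTION_RELEVANCE_SCALING_FACTOR
        + sum(query_scores) * QUERY_RELEVANCE_SCALING_FACTOR
    ) / 2 / num_videos
    return max(score, MIN_SCORE)

print(aggregate_score([0.8, 0.6], [0.7, 0.5], num_videos=2))
```

Note that dividing by `num_videos` (the requested count, not the surviving count) means miners whose videos were filtered out earlier are implicitly penalized here.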
4089            # Log all our scores
4090            bt.logging.info(f'''
4091                is_unique: {[not is_sim for is_sim in is_too_similar]},
4092                video cosine sim: {video_description_relevance_scores},
4093                audio cosine sim: {audio_description_relevance_scores},
4094                description relevance scores: {description_relevance_scores},
4095                query relevance scores: {query_relevance_scores},
4096                length scalers: {length_scalers},
4097                total score: {score}
4098            ''')
4099
4100            # Upload our final results to API endpoint for index and dataset insertion. Include leaderboard statistics
4101            miner_hotkey = videos.axon.hotkey
4102            upload_result = await self.upload_video_metadata(metadata, description_relevance_scores, query_relevance_scores, videos.query, None, score, miner_hotkey)
4103            if upload_result:
4104                bt.logging.info("Uploading of video metadata successful.")
4105            else:
4106                bt.logging.error("Issue uploading video metadata.")
4107
4108            return score
4109
4110        except Exception as e:
4111            bt.logging.error(f"Error in check_videos_and_calculate_rewards_youtube: {e}")
4112            traceback.print_exc()
4113            return None
4114    
4115    async def check_videos_and_calculate_rewards_focus(
4116        self,
4117        videos,
4118    ) -> Optional[float]:
4119        try:
4120            # return if no purchased videos were found
4121            if len(videos["purchased_videos"]) == 0:
4122                bt.logging.info("No focus videos found for miner.")
4123                return None
4124            
4125            total_score = 0
4126            # Aggregate scores
4127            for video in videos["purchased_videos"]:
4128                bt.logging.debug(f"Focus video score for {video['video_id']}: {video['video_score']}")
4129                
4130                # Set final score, giving minimum if necessary
4131                score = max(float(video["video_score"]), MIN_SCORE)
4132                total_score += score
4133
4134            return total_score
4135        except Exception as e:
4136            bt.logging.error(f"Error in check_videos_and_calculate_rewards_focus: {e}")
4137            traceback.print_exc()
4138            return None
4139    
4140    # Get all the focus reward results by concurrently calling your check_videos_and_calculate_rewards_focus() function.
4141    async def handle_checks_and_rewards_focus(
4142        self, focus_videos
4143    ) -> List[Optional[float]]:
4144
4145        rewards = await asyncio.gather(*[
4146            self.check_videos_and_calculate_rewards_focus(
4147                focus_video
4148            )
4149            for focus_video in focus_videos
4150        ])
4151        return rewards
4152    
4153    # Get all the reward results by concurrently calling your check_videos_and_calculate_rewards_youtube() function.
4154    async def handle_checks_and_rewards_youtube(
4155        self,
4156        input_synapse: Videos,
4157        responses: List[Videos],
4158    ) -> List[Optional[float]]:
4159        
4160        rewards = await asyncio.gather(*[
4161            self.check_videos_and_calculate_rewards_youtube(
4162                input_synapse,
4163                response.replace_with_input(input_synapse), # replace with input properties from input_synapse
4164            )
4165            for response in responses
4166        ])
4167        return rewards
4168    
4169    async def handle_checks_and_reward_audio(
4170        self,
4171        input_synapse: Audios,
4172        responses: List[Audios],
4173    ) -> List[Optional[float]]:
4174        rewards = await asyncio.gather(*[
4175            self.check_audios_and_calculate_rewards(
4176                input_synapse,
4177                response,
4178            )
4179            for response in responses
4180        ])
4181        return rewards
4182    
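The three `handle_*` methods above share one fan-out pattern: per-miner scoring coroutines run concurrently under `asyncio.gather`, and each coroutine catches its own exceptions and returns `None`, so `gather` never needs `return_exceptions`. A minimal stand-alone sketch (the `score_one` body is a hypothetical stand-in for the real network/GPU scoring work):

```python
import asyncio

async def score_one(miner_id):
    """Stand-in for a per-miner reward coroutine; the real version does
    network and GPU work here."""
    await asyncio.sleep(0)   # yield control, as real I/O would
    if miner_id < 0:
        return None          # scoring failure maps to None, not an exception
    return float(miner_id) * 0.1

async def score_all(miner_ids):
    # Fan out all per-miner checks concurrently, mirroring the
    # handle_checks_and_rewards_* methods above.
    return await asyncio.gather(*[score_one(m) for m in miner_ids])

rewards = asyncio.run(score_all([1, 2, -1, 3]))
print(rewards)
```

Because failures surface as `None` entries rather than raised exceptions, the resulting reward list stays positionally aligned with the input miner list.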
4183    async def upload_video_metadata(
4184        self, 
4185        metadata: List[VideoMetadata], 
4186        description_relevance_scores: List[float], 
4187        query_relevance_scores: List[float], 
4188        query: str, 
4189        novelty_score: float, 
4190        score: float, 
4191        miner_hotkey: str
4192    ) -> bool:
4193        """
4194        Uploads video metadata and associated scores to the validator API
4195        for index and dataset insertion.
4196
4197        Returns:
4198        - bool: True if the upload succeeded, False otherwise.
4199        """
4200        keypair = self.dendrite.keypair
4201        hotkey = keypair.ss58_address
4202        signature = f"0x{keypair.sign(hotkey).hex()}"
4203        try:
4204            async with ClientSession() as session:
4205                # Serialize the list of VideoMetadata
4206                # serialized_metadata = [item.dict() for item in metadata]
4207                serialized_metadata = [json.loads(item.model_dump_json()) for item in metadata]
4208                # Construct the JSON payload
4209                payload = {
4210                    "metadata": serialized_metadata,
4211                    "description_relevance_scores": description_relevance_scores,
4212                    "query_relevance_scores": query_relevance_scores,
4213                    "topic_query": query,
4214                    "novelty_score": novelty_score,
4215                    "total_score": score,
4216                    "miner_hotkey": miner_hotkey
4217                }
4218
4219                async with session.post(
4220                    self.upload_video_metadata_endpoint,
4221                    auth=BasicAuth(hotkey, signature),
4222                    json=payload,
4223                ) as response:
4224                    response.raise_for_status()
4225                    result = await response.json()
4226            return True
4227        except Exception as e:
4228            bt.logging.debug(f"Error trying upload_video_metadata_endpoint: {e}")
4229            traceback.print_exc()
4230            return False
4231
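The upload method serializes metadata via `json.loads(item.model_dump_json())` rather than the commented-out `item.dict()`: routing through the model's JSON encoder first coerces non-JSON types (such as datetimes) into JSON-safe values. A stdlib-only sketch of the same round-trip idea, using a hypothetical dataclass stand-in for the pydantic `VideoMetadata` model:

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass
class FakeVideoMetadata:
    # Hypothetical stand-in for the pydantic VideoMetadata model.
    video_id: str
    created_at: datetime

meta = FakeVideoMetadata("abc123", datetime(2024, 9, 3, 16, 18, 3))

# json.dumps(asdict(meta)) alone would raise TypeError on the datetime
# field; encoding through JSON first (as model_dump_json does for
# pydantic models) yields plain JSON-safe values for the payload.
serialized = json.loads(json.dumps(asdict(meta), default=str))
print(serialized)
```

The resulting dict can be placed directly into an HTTP JSON payload without serialization errors.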
4232
4233    async def upload_audio_metadata(
4234        self, 
4235        metadata: List[AudioMetadata], 
4236        inverse_der: float, 
4237        audio_length_score: float, 
4238        audio_quality_total_score: float, 
4239        audio_query_score: float, 
4240        query: str, 
4241        total_score: float, 
4242        miner_hotkey: str
4243    ) -> bool:
4244        """
4245        Uploads audio metadata and associated scores to the validator API
4246        for index and dataset insertion.
4247
4248        Returns:
4249        - bool: True if the upload succeeded, False otherwise.
4250        """
4251        keypair = self.dendrite.keypair
4252        hotkey = keypair.ss58_address
4253        signature = f"0x{keypair.sign(hotkey).hex()}"
4254        try:
4255            async with ClientSession() as session:
4256                # Serialize the list of AudioMetadata
4257                # serialized_metadata = [item.dict() for item in metadata]
4258                serialized_metadata = [json.loads(item.model_dump_json()) for item in metadata]
4259                # Construct the JSON payload
4260                payload = {
4261                    "metadata": serialized_metadata,
4262                    "inverse_der": inverse_der,
4263                    "audio_length_score": audio_length_score,
4264                    "audio_quality_total_score": audio_quality_total_score,
4265                    "audio_query_score": audio_query_score,
4266                    "topic_query": query,
4267                    "total_score": total_score,
4268                    "miner_hotkey": miner_hotkey
4269                }
4270
4271                async with session.post(
4272                    self.upload_audio_metadata_endpoint,
4273                    auth=BasicAuth(hotkey, signature),
4274                    json=payload,
4275                ) as response:
4276                    response.raise_for_status()
4277                    result = await response.json()
4278            return True
4279        except Exception as e:
4280            bt.logging.debug(f"Error trying upload_audio_metadata_endpoint: {e}")
4281            traceback.print_exc()
4282            return False
4283
4284    async def get_novelty_scores(self, metadata: List[VideoMetadata]) -> Optional[List[float]]:
4285        """
4286        Queries the validator api to get novelty scores for supplied videos. 
4287        Returns a list of float novelty scores for each video after deduplicating.
4288
4289        Returns:
4290        - List[float]: The novelty scores for the miner's videos.
4291        """
4292        keypair = self.dendrite.keypair
4293        hotkey = keypair.ss58_address
4294        signature = f"0x{keypair.sign(hotkey).hex()}"
4295        try:
4296            async with ClientSession() as session:
4297                # Serialize the list of VideoMetadata
4298                serialized_metadata = [item.dict() for item in metadata]
4299
4300                async with session.post(
4301                    self.novelty_scores_endpoint,
4302                    auth=BasicAuth(hotkey, signature),
4303                    json=serialized_metadata,
4304                ) as response:
4305                    response.raise_for_status()
4306                    novelty_scores = await response.json()
4307            return novelty_scores
4308        
4309        except Exception as e:
4310            bt.logging.debug(f"Error trying novelty_scores_endpoint: {e}")
4311            traceback.print_exc()
4312            return None
4313    
4314    # async def get_novelty_scores_audio(self, metadata: List[AudioMetadata]) -> List[float]:
4315        
4316    async def get_proxy_url(self) -> Optional[str]:
4317        """
4318        Queries the validator api to get a random proxy URL.
4319
4320        Returns:
4321        - str: A proxy URL
4322        """
4323        keypair = self.dendrite.keypair
4324        hotkey = keypair.ss58_address
4325        signature = f"0x{keypair.sign(hotkey).hex()}"
4326        try:
4327            async with ClientSession() as session:
4328                async with session.post(
4329                    self.proxy_endpoint,
4330                    auth=BasicAuth(hotkey, signature),
4331                ) as response:
4332                    response.raise_for_status()
4333                    proxy_url = await response.json()
4334            return proxy_url
4335        except Exception as e:
4336            bt.logging.debug(f"Error trying proxy_endpoint: {e}")
4337            traceback.print_exc()
4338            return None
4339     
4340
4341    async def check_audios_and_calculate_rewards(
4342            self, 
4343            input_synapse: Audios, 
4344            audios: Audios
4345        ) -> Optional[float]:
4346        try:
4347            # return minimum score if no audios were found in audio_metadata
4348            if len(audios.audio_metadata) == 0:
4349                return MIN_SCORE
4350            # check video_ids for fake videos
4351            if any(not video_utils.is_valid_youtube_id(audio.video_id) for audio in audios.audio_metadata):
4352                return FAKE_VIDEO_PUNISHMENT
4353            
4354            # check and filter duplicate metadata
4355            metadata = self.audio_metadata_check(audios.audio_metadata)[:input_synapse.num_audios]
4356            if len(metadata) < len(audios.audio_metadata):
4357                bt.logging.info(f"Filtered {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
4358            
4359
4360            # if randomly tripped, flag our random check to pull a video from miner's submissions
4361            check_video = CHECK_PROBABILITY > random.random()
4362            
4363
4364            
4365            # pull a random video and/or description only
4366            
4367            random_meta_and_vid = await self.get_random_youtube_video(metadata, check_video)
4368            if random_meta_and_vid is None:
4369                return FAKE_VIDEO_PUNISHMENT
4370            
4371            # execute the random check on metadata and video
4372            async with GPU_SEMAPHORE:
4373                if check_video:
4374                    passed_check = await self.random_audio_check(random_meta_and_vid)
4375
4376                    # punish miner if not passing
4377                    if not passed_check:
4378                        return FAKE_VIDEO_PUNISHMENT
4379                query_emb = await self.imagebind.embed_text_async([audios.query])
4380            
4381            embeddings = Embeddings(
4382                video=None, 
4383                audio=torch.stack([torch.tensor(a.audio_emb) for a in metadata]).to(self.imagebind.device),
4384                description=None
4385            )
4386
4387
4388            # check and deduplicate audios based on embedding similarity checks. We do this because we're not uploading to pinecone first.
4389            metadata_is_similar = await self.deduplicate_audios(embeddings)
4390            metadata = [meta for meta, too_similar in zip(metadata, metadata_is_similar) if not too_similar]
4391            embeddings = self.filter_embeddings(embeddings, metadata_is_similar)
4392            
4393            if len(metadata) < len(audios.audio_metadata):
4394                bt.logging.info(f"Deduplicated {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
4395            
4396            # return minimum score if no unique audios were found
4397            if len(metadata) == 0:
4398                return MIN_SCORE
4399            
4400            # first get local novelty scores
4401            local_novelty_scores = self.compute_novelty_score_among_batch_audio(embeddings)
4402            
4403            pre_filter_metadata_length = len(metadata)
4404            # check scores from index for being too similar
4405            is_too_similar = [score < DIFFERENCE_THRESHOLD for score in local_novelty_scores]
4406            # filter out metadata too similar
4407            metadata = [meta for meta, too_similar in zip(metadata, is_too_similar) if not too_similar]
4408            # filter out embeddings too similar
4409            embeddings = self.filter_embeddings(embeddings, is_too_similar)
4410            if len(metadata) < pre_filter_metadata_length:
4411                bt.logging.info(f"Filtering {pre_filter_metadata_length} audios down to {len(metadata)} audios, removing those too similar to audios in our index.")
4412
4413            # return minimum score if no unique audios were found
4414            if len(metadata) == 0:
4415                return MIN_SCORE
4416            
4417            # Filter audios based on length constraints
4419            pre_filter_metadata_length = len(metadata)
4420            metadata = [
4421                meta for meta in metadata 
4422                if (meta.end_time - meta.start_time) >= MIN_AUDIO_LENGTH_SECONDS 
4423                and (meta.end_time - meta.start_time) <= MAX_AUDIO_LENGTH_SECONDS
4424            ]
4425            
4426            if len(metadata) < pre_filter_metadata_length:
4427                bt.logging.info(f"Filtered {pre_filter_metadata_length} audios down to {len(metadata)} audios based on length constraints")
4428                
4429            # Return minimum score if no audios remain after filtering
4430            if len(metadata) == 0:
4431                return MIN_SCORE
4432            
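The duration filter above keeps only segments whose length lies within the configured bounds. A stand-alone sketch of the same predicate (the bound values here are illustrative, not the validator's actual `MIN/MAX_AUDIO_LENGTH_SECONDS`):

```python
# Illustrative bounds; the validator's MIN/MAX_AUDIO_LENGTH_SECONDS may differ.
MIN_AUDIO_LENGTH_SECONDS = 5.0
MAX_AUDIO_LENGTH_SECONDS = 300.0

def filter_by_duration(segments, lo=MIN_AUDIO_LENGTH_SECONDS, hi=MAX_AUDIO_LENGTH_SECONDS):
    """Keep only (start, end) segments whose duration lies within [lo, hi]."""
    return [(start, end) for start, end in segments if lo <= (end - start) <= hi]

segments = [(0.0, 2.0), (10.0, 70.0), (0.0, 400.0)]
print(filter_by_duration(segments))   # only the 60-second segment survives
```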
4433            total_audio_length = sum((meta.end_time - meta.start_time) for meta in metadata) 
4434            bt.logging.info(f"Average audio length: {total_audio_length/len(metadata):.2f} seconds")
4435            audio_length_score = total_audio_length/(self.num_audios*MAX_AUDIO_LENGTH_SECONDS)
4436            
4437
4438            audio_query_score = sum(F.cosine_similarity(
4439                embeddings.audio, query_emb
4440            ).tolist())/len(metadata)
4441            bt.logging.info(f"Audio query score: {audio_query_score}")
4442
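The `audio_query_score` above is the mean cosine similarity between each audio embedding and the query embedding. For reference, the row-wise quantity that `F.cosine_similarity` computes can be written in plain Python (small 2-D toy vectors here, purely for illustration):

```python
import math

def cosine_similarity(a, b):
    """Plain-Python cosine similarity between two equal-length vectors,
    mirroring what F.cosine_similarity computes per row."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1.0, 0.0]
audio_embs = [[1.0, 0.0], [0.0, 1.0]]
# Mean per-item similarity to the query, as in audio_query_score above.
mean_sim = sum(cosine_similarity(e, query) for e in audio_embs) / len(audio_embs)
print(mean_sim)   # (1.0 + 0.0) / 2 = 0.5
```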
4443            # Randomly sample one audio for duration check
4444            selected_random_meta = random.choice(metadata)
4445            audio_array, sr = sf.read(BytesIO(base64.b64decode(selected_random_meta.audio_bytes)))
4446            audio_duration = len(audio_array) / sr
4447            bt.logging.info(f"Selected YouTube video: {selected_random_meta.video_id}, Duration: {audio_duration:.2f} seconds")
4448
4449            audio_quality_scores = self.audio_score.total_score(
4450                audio_array,
4451                sr,
4452                selected_random_meta.diar_timestamps_start,
4453                selected_random_meta.diar_timestamps_end,
4454                selected_random_meta.diar_speakers
4455            )
4456            audio_quality_total_score = (
4457                audio_quality_scores["speech_content_score"] * SPEECH_CONTENT_SCALING_FACTOR +
4458                audio_quality_scores["speaker_dominance_score"] * SPEAKER_DOMINANCE_SCALING_FACTOR +
4459                audio_quality_scores["background_noise_score"] * BACKGROUND_NOISE_SCALING_FACTOR +
4460                audio_quality_scores["unique_speakers_error"] * UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR
4461            )
4462            # query score
4463
4464            ## diarization segment
4465            miner_diar_segment = {
4466                "start": selected_random_meta.diar_timestamps_start,
4467                "end": selected_random_meta.diar_timestamps_end,
4468                "speakers": selected_random_meta.diar_speakers
4469            }
4470
4471            diarization_score = calculate_diarization_metrics(
4472                audio_array,
4473                sr,
4474                miner_diar_segment
4475            )
4476            inverse_der = diarization_score["inverse_der"]
4477            total_score = (
4478                DIARIZATION_SCALING_FACTOR * inverse_der +
4479                AUDIO_LENGTH_SCALING_FACTOR * audio_length_score +
4480                AUDIO_QUALITY_SCALING_FACTOR * audio_quality_total_score +
4481                AUDIO_QUERY_RELEVANCE_SCALING_FACTOR * audio_query_score
4482            )
4483
4484            bt.logging.info(
4485                f"total_score: {total_score}, "
4486                f"inverse_der: {inverse_der}, "
4487                f"audio_length_score: {audio_length_score}, "
4488                f"audio_quality_total_score: {audio_quality_total_score}, "
4489                f"audio_query_score: {audio_query_score}"
4490            )
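The final audio score is a weighted sum of the four sub-scores. A pure-function sketch of that aggregation; the weights below are illustrative placeholders (chosen to sum to 1.0), not the validator's actual scaling factors:

```python
# Illustrative weights; the validator's scaling factors may differ.
DIARIZATION_SCALING_FACTOR = 0.35
AUDIO_LENGTH_SCALING_FACTOR = 0.25
AUDIO_QUALITY_SCALING_FACTOR = 0.25
AUDIO_QUERY_RELEVANCE_SCALING_FACTOR = 0.15

def total_audio_score(inverse_der, length_score, quality_score, query_score):
    """Weighted sum of the four audio sub-scores."""
    return (
        DIARIZATION_SCALING_FACTOR * inverse_der
        + AUDIO_LENGTH_SCALING_FACTOR * length_score
        + AUDIO_QUALITY_SCALING_FACTOR * quality_score
        + AUDIO_QUERY_RELEVANCE_SCALING_FACTOR * query_score
    )

print(total_audio_score(1.0, 1.0, 1.0, 1.0))  # perfect sub-scores score ~1.0 under these weights
```

With weights summing to 1.0, each sub-score in [0, 1] keeps the total in [0, 1] as well.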
4491            # Upload our final results to API endpoint for index and dataset insertion. Include leaderboard statistics
4492            miner_hotkey = audios.axon.hotkey
4493            bt.logging.info(f"Uploading audio metadata for miner: {miner_hotkey}")
4494            upload_result = await self.upload_audio_metadata(metadata, inverse_der, audio_length_score, audio_quality_total_score, audio_query_score, audios.query, total_score, miner_hotkey)
4495            if upload_result:
4496                bt.logging.info("Uploading of audio metadata successful.")
4497            else:
4498                bt.logging.error("Issue uploading audio metadata.")
4499            return total_score
4500
4501
4502        except Exception as e:
4503            bt.logging.error(f"Error in check_audios_and_calculate_rewards: {e}")
4504            traceback.print_exc()
4505            return None
4506
4507
4508    async def reward(self, input_synapse: Videos, response: Videos) -> Optional[float]:
4509        """
4510        Reward the miner response to the query. This method returns a reward
4511        value for the miner, which is used to update the miner's score.
4512
4513        Returns:
4514        - float: The reward value for the miner.
4515        """
4516        keypair = self.dendrite.keypair
4517        hotkey = keypair.ss58_address
4518        signature = f"0x{keypair.sign(hotkey).hex()}"
4519        
4520        try:
4521            async with ClientSession() as session:
4522                async with session.post(
4523                    self.validation_endpoint,
4524                    auth=BasicAuth(hotkey, signature),
4525                    json=response.to_serializable_dict(input_synapse),
4526                ) as api_response:
4527                    api_response.raise_for_status()
4528                    score = await api_response.json()
4529            return score
4530        except Exception as e:
4531            bt.logging.debug(f"Error in reward: {e}")
4532            traceback.print_exc()
4533            return None
4534
4535    async def get_rewards(
4536        self,
4537        input_synapse: Videos,
4538        responses: List[Videos],
4539    ) -> List[Optional[float]]:
4540        """
4541        Returns a tensor of rewards for the given query and responses.
4542        """
4543        # Get all the reward results by concurrently calling your reward() function.
4544        rewards = await asyncio.gather(*[
4545            self.reward(
4546                input_synapse,
4547                response,
4548            )
4549            for response in responses
4550        ])
4551        return rewards
4552
4553    """
4554    {
4555        "5DaNytPVo6uFZFr2f9pZ6ck2gczNyYebLgrYZoFuccPS6qMi": {
4556            "purchased_videos": [{
4557                    "video_id": "bcdb8247-2261-4268-af9c-1275101730d5",
4558                    "task_id": "salman_test",
4559                    "user_email": "[email protected]",
4560                    "video_score": 0.408363,
4561                    "video_details": {
4562                        "description": "This is a random score, testing purposes only",
4563                        "focusing_task": "focusing on nothing!",
4564                        "video_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
4565                    },
4566                    "rejection_reason": null,
4567                    "expected_reward_tao": 0.0816726,
4568                    "earned_reward_tao": 0.0816726,
4569                    "created_at": "2024-09-03T16:18:03",
4570                    "updated_at": "2024-09-03T16:28:20",
4571                    "deleted_at": null,
4572                    "processing_state": "PURCHASED",
4573                    "miner_uid": null,
4574                    "miner_hotkey": "5DaNytPVo6uFZFr2f9pZ6ck2gczNyYebLgrYZoFuccPS6qMi"
4575                }
4576            ],
4577            "total_focus_points": 127.2251,
4578            "max_focus_points": 1000.0,
4579            "focus_points_percentage": 0.1272251
4580        }
4581    }
4582    """
4583    async def get_focus_videos(self, miner_hotkeys: List[str], miner_uids: List[int]) -> List[Dict]:
4584        bt.logging.debug(f"Making API call to get focus videos for {miner_hotkeys}")
4585        miner_hotkeys_str = ",".join(miner_hotkeys)
4586        
4587        async with ClientSession() as session:
4588            try:
4589                async with session.get(f"{self.focus_miner_purchases_endpoint}/{miner_hotkeys_str}", timeout=10) as response:
4590                    if response.status == 200:
4591                        res_data = await response.json()
4592                        if len(res_data) == 0:
4593                            bt.logging.debug(f"-- No focus videos found for {miner_hotkeys}")
4594                            return []
4595                        
4596                        result = []
4597                        for i, miner_hotkey in enumerate(miner_hotkeys):
4598                            if miner_hotkey in res_data:
4599                                miner_data = res_data[miner_hotkey]
4600                                miner_data['miner_hotkey'] = miner_hotkey
4601                                miner_data['miner_uid'] = miner_uids[i]
4602                                result.append(miner_data)
4603                                if len(miner_data["purchased_videos"]) == 0:
4604                                    bt.logging.debug(f"-- No focus videos found for {miner_hotkey}")
4605                            else:
4606                                bt.logging.debug(f"-- No data found for {miner_hotkey}")
4607                        
4608                        return result
4609                    else:
4610                        error_message = await response.text()
4611                        bt.logging.warning(f"Retrieving miner focus videos failed. Status: {response.status}, Message: {error_message}")
4612                        return []
4613            except asyncio.TimeoutError:
4614                bt.logging.error("Request timed out in get_focus_videos")
4615                return []
4616            except Exception as e:
4617                bt.logging.error(f"Error in get_focus_videos: {e}")
4618                traceback.print_exc()
4619                return []
4620
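`get_focus_videos` guards its request with a 10-second timeout and degrades to an empty result instead of raising. The same timeout-and-fallback shape can be sketched with stdlib `asyncio` alone (the `fetch_focus_data` body is a hypothetical stand-in for the real aiohttp GET):

```python
import asyncio

async def fetch_focus_data():
    """Stand-in for the HTTP GET; the real code awaits an aiohttp response."""
    await asyncio.sleep(0.05)
    return {"miner": {"purchased_videos": []}}

async def get_with_timeout(timeout):
    # Mirror get_focus_videos: a slow endpoint yields an empty result
    # rather than propagating an exception to the caller.
    try:
        return await asyncio.wait_for(fetch_focus_data(), timeout=timeout)
    except asyncio.TimeoutError:
        return []

print(asyncio.run(get_with_timeout(1.0)))    # completes in time
print(asyncio.run(get_with_timeout(0.001)))  # times out -> []
```

aiohttp's `timeout=10` argument surfaces as the same `asyncio.TimeoutError`, which is why the method's `except asyncio.TimeoutError` branch works.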
4621# The main function parses the configuration and runs the validator.
4622if __name__ == "__main__":
4623    Validator().run()
4624
4625
4626---
4627File: /omega/api/examples/subnet21.py
4628---
4629
4630# The MIT License (MIT)
4631# Copyright © 2021 Yuma Rao
4632# Copyright © 2023 Opentensor Foundation
4633# Copyright © 2023 Opentensor Technologies Inc
4634
4635# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
4636# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
4637# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
4638# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4639
4640# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
4641# the Software.
4642
4643# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
4644# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
4645# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
4646# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
4647# DEALINGS IN THE SOFTWARE.
4648
4649import torch
4650import base64
4651import bittensor as bt
4652from abc import ABC, abstractmethod
4653from typing import Any, List, Union
4654from bittensor.subnets import SubnetsAPI
4655
4656try:
4657    from storage.validator.cid import generate_cid_string
4658    from storage.validator.encryption import (
4659        encrypt_data,
4660        decrypt_data_with_private_key,
4661    )
except ImportError:
4663    storage_url = "https://github.com/ifrit98/storage-subnet"
4664    bt.logging.error(
4665        f"Storage Subnet 21 not installed. Please visit: {storage_url} and install the package to use this example."
4666    )
4667
4668
4669class StoreUserAPI(SubnetsAPI):
4670    def __init__(self, wallet: "bt.wallet"):
4671        super().__init__(wallet)
4672        self.netuid = 21
4673
4674    def prepare_synapse(
4675        self,
4676        data: bytes,
4677        encrypt=False,
4678        ttl=60 * 60 * 24 * 30,
4679        encoding="utf-8",
4680    ) -> StoreUser:
4681        data = bytes(data, encoding) if isinstance(data, str) else data
4682        encrypted_data, encryption_payload = (
4683            encrypt_data(data, self.wallet) if encrypt else (data, "{}")
4684        )
4685        expected_cid = generate_cid_string(encrypted_data)
4686        encoded_data = base64.b64encode(encrypted_data)
4687
4688        synapse = StoreUser(
4689            encrypted_data=encoded_data,
4690            encryption_payload=encryption_payload,
4691            ttl=ttl,
4692        )
4693
4694        return synapse
4695
4696    def process_responses(
4697        self, responses: List[Union["bt.Synapse", Any]]
4698    ) -> str:
4699        success = False
4700        failure_modes = {"code": [], "message": []}
4701        for response in responses:
4702            if response.dendrite.status_code != 200:
4703                failure_modes["code"].append(response.dendrite.status_code)
4704                failure_modes["message"].append(
4705                    response.dendrite.status_message
4706                )
4707                continue
4708
4709            stored_cid = (
4710                response.data_hash.decode("utf-8")
4711                if isinstance(response.data_hash, bytes)
4712                else response.data_hash
4713            )
4714            bt.logging.debug("received data CID: {}".format(stored_cid))
4715            success = True
4716            break
4717
4718        if success:
4719            bt.logging.info(
4720                f"Stored data on the Bittensor network with CID {stored_cid}"
4721            )
4722        else:
4723            bt.logging.error(
4724                f"Failed to store data. Response failure codes & messages {failure_modes}"
4725            )
4726            stored_cid = ""
4727
4728        return stored_cid
4729
4730
4731class RetrieveUserAPI(SubnetsAPI):
4732    def __init__(self, wallet: "bt.wallet"):
4733        super().__init__(wallet)
4734        self.netuid = 21
4735
4736    def prepare_synapse(self, cid: str) -> RetrieveUser:
4737        synapse = RetrieveUser(data_hash=cid)
4738        return synapse
4739
4740    def process_responses(
4741        self, responses: List[Union["bt.Synapse", Any]]
4742    ) -> bytes:
4743        success = False
4744        decrypted_data = b""
4745        for response in responses:
4746            bt.logging.trace(f"response: {response.dendrite.dict()}")
4747            if (
4748                response.dendrite.status_code != 200
4749                or response.encrypted_data is None
4750            ):
4751                continue
4752
4753            # Decrypt the response
4754            bt.logging.trace(
4755                f"encrypted_data: {response.encrypted_data[:100]}"
4756            )
4757            encrypted_data = base64.b64decode(response.encrypted_data)
4758            bt.logging.debug(
4759                f"encryption_payload: {response.encryption_payload}"
4760            )
4761            if (
4762                response.encryption_payload is None
4763                or response.encryption_payload == ""
4764                or response.encryption_payload == "{}"
4765            ):
4766                bt.logging.warning(
4767                    "No encryption payload found. Unencrypted data."
4768                )
4769                decrypted_data = encrypted_data
4770            else:
4771                decrypted_data = decrypt_data_with_private_key(
4772                    encrypted_data,
4773                    response.encryption_payload,
4774                    bytes(self.wallet.coldkey.private_key.hex(), "utf-8"),
4775                )
4776            bt.logging.trace(f"decrypted_data: {decrypted_data[:100]}")
4777            success = True
4778            break
4779
4780        if success:
4781            bt.logging.info(
4782                f"Returning retrieved data: {decrypted_data[:100]}"
4783            )
4784        else:
4785            bt.logging.error("Failed to retrieve data.")
4786
4787        return decrypted_data
4788
4789
4790async def test_store_and_retrieve(
4791    netuid: int = 22, wallet: "bt.wallet" = None
4792):
4793    # Example usage
4794    wallet = wallet or bt.wallet()
4795
4796    # Instantiate the handler
4797    store_handler = StoreUserAPI(wallet)
4798
4799    # Fetch the axons you want to query
4800    metagraph = bt.subtensor("test").metagraph(netuid=netuid)
4801    query_axons = metagraph.axons
4802
4803    cid = await store_handler(
4804        axons=query_axons,
4805        # any arguments for the proper synapse
4806        data=b"some data",
4807        encrypt=True,
4808        ttl=60 * 60 * 24 * 30,
4809        encoding="utf-8",
4810        uid=None,
4811    )
4812    print("CID:", cid)
4813
4814    retrieve_handler = RetrieveUserAPI(wallet)
4815    retrieve_response = await retrieve_handler(axons=query_axons, cid=cid)
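
`StoreUserAPI` base64-encodes the (optionally encrypted) payload before placing it in the synapse, and `RetrieveUserAPI` reverses that step before decrypting. The encode/decode round trip can be sketched without any network or wallet dependencies:

```python
import base64

# Stand-in payload; in the real flow this would be the (possibly encrypted) data.
payload = b"some data"

encoded = base64.b64encode(payload)   # what prepare_synapse ships in the synapse
decoded = base64.b64decode(encoded)   # what process_responses recovers

assert decoded == payload
```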
4816
4817
4818
4819---
4820File: /omega/api/__init__.py
4821---
4822
4823
4824
4825
4826---
4827File: /omega/api/dummy.py
4828---
4829
4830# The MIT License (MIT)
4831# Copyright © 2021 Yuma Rao
4832# Copyright © 2023 Opentensor Foundation
4833# Copyright © 2023 Opentensor Technologies Inc
4834
4835# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
4836# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
4837# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
4838# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4839
4840# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
4841# the Software.
4842
4843# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
4844# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
4845# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
4846# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
4847# DEALINGS IN THE SOFTWARE.
4848
4849import bittensor as bt
4850from typing import List, Optional, Union, Any, Dict
4851from omega.protocol import Dummy
4852from bittensor.subnets import SubnetsAPI
4853
4854
4855class DummyAPI(SubnetsAPI):
4856    def __init__(self, wallet: "bt.wallet"):
4857        super().__init__(wallet)
4858        self.netuid = 33
4859        self.name = "dummy"
4860
4861    def prepare_synapse(self, dummy_input: int) -> Dummy:
4862        synapse = Dummy(dummy_input=dummy_input)
4863        return synapse
4864
4865    def process_responses(
4866        self, responses: List[Union["bt.Synapse", Any]]
4867    ) -> List[int]:
4868        outputs = []
4869        for response in responses:
4870            if response.dendrite.status_code != 200:
4871                continue
4872            outputs.append(response.dummy_output)
4873        return outputs
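
The status-code filter in `process_responses` is the standard pattern for collecting outputs from a dendrite broadcast. A dependency-free sketch with stand-in response objects (the `Fake*` class names are illustrative, not part of bittensor):

```python
class FakeDendrite:
    def __init__(self, status_code):
        self.status_code = status_code

class FakeResponse:
    def __init__(self, status_code, dummy_output):
        self.dendrite = FakeDendrite(status_code)
        self.dummy_output = dummy_output

def collect_outputs(responses):
    # Mirror DummyAPI.process_responses: keep outputs only from HTTP-200 replies.
    outputs = []
    for response in responses:
        if response.dendrite.status_code != 200:
            continue  # skip failed calls
        outputs.append(response.dummy_output)
    return outputs
```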
4874
4875
4876
4877---
4878File: /omega/api/get_query_axons.py
4879---
4880
4881# The MIT License (MIT)
4882# Copyright © 2021 Yuma Rao
4883# Copyright © 2023 Opentensor Foundation
4884# Copyright © 2023 Opentensor Technologies Inc
4885
4886# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
4887# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
4888# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
4889# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4890
4891# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
4892# the Software.
4893
4894# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
4895# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
4896# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
4897# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
4898# DEALINGS IN THE SOFTWARE.
4899
4900import torch
4901import random
4902import bittensor as bt
4903
4904
4905async def ping_uids(dendrite, metagraph, uids, timeout=3):
4906    """
4907    Pings a list of UIDs to check their availability on the Bittensor network.
4908
4909    Args:
4910        dendrite (bittensor.dendrite): The dendrite instance to use for pinging nodes.
4911        metagraph (bittensor.metagraph): The metagraph instance containing network information.
4912        uids (list): A list of UIDs (unique identifiers) to ping.
4913        timeout (int, optional): The timeout in seconds for each ping. Defaults to 3.
4914
4915    Returns:
4916        tuple: A tuple containing two lists:
4917            - The first list contains UIDs that were successfully pinged.
4918            - The second list contains UIDs that failed to respond.
4919    """
4920    axons = [metagraph.axons[uid] for uid in uids]
4921    try:
4922        responses = await dendrite(
4923            axons,
4924            bt.Synapse(),  # TODO: potentially get the synapses available back?
4925            deserialize=False,
4926            timeout=timeout,
4927        )
4928        successful_uids = [
4929            uid
4930            for uid, response in zip(uids, responses)
4931            if response.dendrite.status_code == 200
4932        ]
4933        failed_uids = [
4934            uid
4935            for uid, response in zip(uids, responses)
4936            if response.dendrite.status_code != 200
4937        ]
4938    except Exception as e:
4939        bt.logging.error(f"Dendrite ping failed: {e}")
4940        successful_uids = []
4941        failed_uids = uids
4942    bt.logging.debug(f"ping() successful uids: {successful_uids}")
4943    bt.logging.debug(f"ping() failed uids    : {failed_uids}")
4944    return successful_uids, failed_uids
4945
4946
4947async def get_query_api_nodes(dendrite, metagraph, n=0.1, timeout=3):
4948    """
4949    Fetches the available API nodes to query for the particular subnet.
4950
4951    Args:
4952        wallet (bittensor.wallet): The wallet instance to use for querying nodes.
4953        metagraph (bittensor.metagraph): The metagraph instance containing network information.
4954        n (float, optional): The fraction of top nodes to consider based on stake. Defaults to 0.1.
4955        timeout (int, optional): The timeout in seconds for pinging nodes. Defaults to 3.
4956
4957    Returns:
4958        list: A list of UIDs representing the available API nodes.
4959    """
4960    bt.logging.debug(
4961        f"Fetching available API nodes for subnet {metagraph.netuid}"
4962    )
4963    vtrust_uids = [
4964        uid.item()
4965        for uid in metagraph.uids
4966        if metagraph.validator_trust[uid] > 0
4967    ]
4968    top_uids = torch.where(metagraph.S > torch.quantile(metagraph.S, 1 - n))
4969    top_uids = top_uids[0].tolist()
4970    init_query_uids = set(top_uids).intersection(set(vtrust_uids))
4971    query_uids, _ = await ping_uids(
4972        dendrite, metagraph, init_query_uids, timeout=timeout
4973    )
4974    bt.logging.debug(
4975        f"Available API node UIDs for subnet {metagraph.netuid}: {query_uids}"
4976    )
4977    if len(query_uids) > 3:
4978        query_uids = random.sample(query_uids, 3)
4979    return query_uids
4980
4981
4982async def get_query_api_axons(
4983    wallet, metagraph=None, n=0.1, timeout=3, uids=None
4984):
4985    """
4986    Retrieves the axons of query API nodes based on their availability and stake.
4987
4988    Args:
4989        wallet (bittensor.wallet): The wallet instance to use for querying nodes.
4990        metagraph (bittensor.metagraph, optional): The metagraph instance containing network information.
4991        n (float, optional): The fraction of top nodes to consider based on stake. Defaults to 0.1.
4992        timeout (int, optional): The timeout in seconds for pinging nodes. Defaults to 3.
4993        uids (Union[List[int], int], optional): The specific UID(s) of the API node(s) to query. Defaults to None.
4994
4995    Returns:
4996        list: A list of axon objects for the available API nodes.
4997    """
4998    dendrite = bt.dendrite(wallet=wallet)
4999
5000    if metagraph is None:
5001        metagraph = bt.metagraph(netuid=21)
5002
5003    if uids is not None:
5004        query_uids = [uids] if isinstance(uids, int) else uids
5005    else:
5006        query_uids = await get_query_api_nodes(
5007            dendrite, metagraph, n=n, timeout=timeout
5008        )
5009    return [metagraph.axons[uid] for uid in query_uids]
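
`get_query_api_nodes` keeps only UIDs whose stake clears the `(1 - n)` quantile before intersecting with validator-trust holders. A pure-Python approximation of that cutoff (nearest-rank rather than `torch.quantile`'s interpolation; the stake values are made up):

```python
def top_fraction_uids(stakes, n=0.1):
    # Keep UIDs whose stake exceeds the (1 - n) quantile of all stakes
    # (nearest-rank cutoff, a simplification of torch.quantile).
    ranked = sorted(stakes)
    cutoff = ranked[int((1 - n) * (len(ranked) - 1))]
    return [uid for uid, s in enumerate(stakes) if s > cutoff]

top_fraction_uids([10.0, 500.0, 30.0, 800.0, 5.0], n=0.1)  # → [3]
```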
5010
5011
5012
5013---
5014File: /omega/base/__init__.py
5015---
5016
5017
5018
5019
5020---
5021File: /omega/base/miner.py
5022---
5023
5024# The MIT License (MIT)
5025# Copyright © 2023 Yuma Rao
5026
5027# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
5028# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
5029# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
5030# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5031
5032# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
5033# the Software.
5034
5035# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
5036# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
5037# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
5038# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
5039# DEALINGS IN THE SOFTWARE.
5040
5041import time
5042import asyncio
5043import threading
5044import argparse
5045import traceback
5046import datetime as dt
5047
5048import bittensor as bt
5049
5050from omega.base.neuron import BaseNeuron
5051from omega.utils.config import add_miner_args
5052
5053
5054class BaseMinerNeuron(BaseNeuron):
5055    """
5056    Base class for Bittensor miners.
5057    """
5058
5059    neuron_type: str = "MinerNeuron"
5060
5061    @classmethod
5062    def add_args(cls, parser: argparse.ArgumentParser):
5063        super().add_args(parser)
5064        add_miner_args(cls, parser)
5065
5066    def __init__(self, config=None):
5067        super().__init__(config=config)
5068
5069        # Warn if allowing incoming requests from anyone.
5070        if not self.config.blacklist.force_validator_permit:
5071            bt.logging.warning(
5072                "You are allowing non-validators to send requests to your miner. This is a security risk."
5073            )
5074        if self.config.blacklist.allow_non_registered:
5075            bt.logging.warning(
5076                "You are allowing non-registered entities to send requests to your miner. This is a security risk."
5077            )
5078
5079        # The axon handles request processing, allowing validators to send this miner requests.
5080        self.axon = bt.axon(wallet=self.wallet, config=self.config)
5081
5082        # Attach the functions which are called when servicing a request.
5083        bt.logging.info("Attaching forward functions to miner axon.")
5084        self.axon.attach(
5085            forward_fn=self.forward_videos,
5086            blacklist_fn=self.blacklist_videos,
5087            priority_fn=self.priority_videos,
5088        ).attach(
5089            forward_fn=self.forward_audios,
5090            blacklist_fn=self.blacklist_audios,
5091            priority_fn=self.priority_audios,
5092        )
5093        bt.logging.info(f"Axon created: {self.axon}")
5094
5095        # Instantiate runners
5096        self.should_exit: bool = False
5097        self.is_running: bool = False
5098        self.thread: threading.Thread = None
5099        self.lock = asyncio.Lock()
5100
5101    def run(self):
5102        """
5103        Initiates and manages the main loop for the miner on the Bittensor network. The main loop handles graceful shutdown on keyboard interrupts and logs unforeseen errors.
5104
5105        This function performs the following primary tasks:
5106        1. Checks for registration on the Bittensor network.
5107        2. Starts the miner's axon, making it active on the network.
5108        3. Periodically resynchronizes with the chain, updating the metagraph with the latest network state and setting weights.
5109
5110        The miner continues its operations until `should_exit` is set to True or an external interruption occurs.
5111        During each epoch of its operation, the miner waits for new blocks on the Bittensor network, updates its
5112        knowledge of the network (metagraph), and sets its weights. This process ensures the miner remains active
5113        and up-to-date with the network's latest state.
5114
5115        Note:
5116            - The function leverages the global configurations set during the initialization of the miner.
5117            - The miner's axon serves as its interface to the Bittensor network, handling incoming and outgoing requests.
5118
5119        Raises:
5120            KeyboardInterrupt: If the miner is stopped by a manual interruption.
5121            Exception: For unforeseen errors during the miner's operation, which are logged for diagnosis.
5122        """
5123
5124        # Check that miner is registered on the network.
5125        self.sync()
5126
5127        # Serve passes the axon information to the network, along with the netuid we are hosting on.
5128        # This will auto-update if the axon port or external IP has changed.
5129        bt.logging.info(
5130            f"Serving miner axon {self.axon} on network: {self.config.subtensor.chain_endpoint} with netuid: {self.config.netuid}"
5131        )
5132        self.axon.serve(netuid=self.config.netuid, subtensor=self.subtensor)
5133
5134        # Start the miner's axon, making it active on the network.
5135        self.axon.start()
5136
5137        bt.logging.info(f"Miner starting at block: {self.block}")
5138
5139        # This loop maintains the miner's operations until intentionally stopped.
5140        try:
5141            while not self.should_exit:
5142                while (
5143                    dt.datetime.now() - self.last_sync_check
5144                ).total_seconds() < self.sync_check_interval:
5145                    
5146                    # Wait before checking again.
5147                    time.sleep(1)
5148
5149                    # Check if we should exit.
5150                    if self.should_exit:
5151                        break
5152
5153                # Sync metagraph and potentially set weights.
5154                self.sync()
5155                self.step += 1
5156
5157        # If someone intentionally stops the miner, it'll safely terminate operations.
5158        except KeyboardInterrupt:
5159            self.axon.stop()
5160            bt.logging.success("Miner killed by keyboard interrupt.")
5161            exit()
5162
5163        # In case of unforeseen errors, the miner will log the error and continue operations.
5164        except Exception:
5165            bt.logging.error(traceback.format_exc())
5166
5167    def run_in_background_thread(self):
5168        """
5169        Starts the miner's operations in a separate background thread.
5170        This is useful for non-blocking operations.
5171        """
5172        if not self.is_running:
5173            bt.logging.debug("Starting miner in background thread.")
5174            self.should_exit = False
5175            self.thread = threading.Thread(target=self.run, daemon=True)
5176            self.thread.start()
5177            self.is_running = True
5178            bt.logging.debug("Started")
5179
5180    def stop_run_thread(self):
5181        """
5182        Stops the miner's operations that are running in the background thread.
5183        """
5184        if self.is_running:
5185            bt.logging.debug("Stopping miner in background thread.")
5186            self.should_exit = True
5187            self.thread.join(5)
5188            self.is_running = False
5189            bt.logging.debug("Stopped")
5190
5191    def __enter__(self):
5192        """
5193        Starts the miner's operations in a background thread upon entering the context.
5194        This method facilitates the use of the miner in a 'with' statement.
5195        """
5196        self.run_in_background_thread()
5197        return self
5198
5199    def __exit__(self, exc_type, exc_value, traceback):
5200        """
5201        Stops the miner's background operations upon exiting the context.
5202        This method facilitates the use of the miner in a 'with' statement.
5203
5204        Args:
5205            exc_type: The type of the exception that caused the context to be exited.
5206                      None if the context was exited without an exception.
5207            exc_value: The instance of the exception that caused the context to be exited.
5208                       None if the context was exited without an exception.
5209            traceback: A traceback object encoding the stack trace.
5210                       None if the context was exited without an exception.
5211        """
5212        self.stop_run_thread()
5213
5214    def resync_metagraph(self):
5215        """Resyncs the metagraph and updates the hotkeys and moving averages based on the new metagraph."""
5216        bt.logging.info("resync_metagraph(self)")
5217
5218        # Sync the metagraph.
5219        self.metagraph.sync(subtensor=self.subtensor)
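
The `__enter__`/`__exit__` pair above wraps `run_in_background_thread` and `stop_run_thread` so the miner can be driven by a `with` statement. The same start/stop pattern in miniature (class and attribute names here are illustrative):

```python
import threading
import time

class BackgroundRunner:
    """Minimal sketch of the miner's background-thread lifecycle."""

    def __init__(self):
        self.should_exit = False
        self.is_running = False
        self.thread = None
        self.ticks = 0

    def run(self):
        # Loop until asked to stop, like BaseMinerNeuron.run's outer loop.
        while not self.should_exit:
            self.ticks += 1
            time.sleep(0.01)

    def __enter__(self):
        self.should_exit = False
        self.thread = threading.Thread(target=self.run, daemon=True)
        self.thread.start()
        self.is_running = True
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Signal the loop to stop and wait (bounded) for the thread to finish.
        self.should_exit = True
        self.thread.join(timeout=5)
        self.is_running = False
```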
5220
5221
5222
5223---
5224File: /omega/base/neuron.py
5225---
5226
5227# The MIT License (MIT)
5228# Copyright © 2023 Yuma Rao
5229
5230# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
5231# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
5232# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
5233# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5234
5235# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
5236# the Software.
5237
5238# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
5239# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
5240# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
5241# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
5242# DEALINGS IN THE SOFTWARE.
5243
5244import copy
5245import typing
5246import datetime as dt
5247
5248import bittensor as bt
5249
5250from abc import ABC, abstractmethod
5251
5252# Sync calls set weights and also resyncs the metagraph.
5253from omega.utils.config import check_config, add_args, config
5254from omega.utils.misc import ttl_get_block
5255from omega import __spec_version__ as spec_version
5256from omega.mock import MockSubtensor, MockMetagraph
5257
5258
5259class BaseNeuron(ABC):
5260    """
5261    Base class for Bittensor neurons. This class is abstract and should be inherited by a subclass. It contains the core logic for all neurons: validators and miners.
5262
5263    In addition to creating a wallet, subtensor, and metagraph, this class also handles the synchronization of the network state via a basic checkpointing mechanism based on epoch length.
5264    """
5265
5266    neuron_type: str = "BaseNeuron"
5267
5268    @classmethod
5269    def check_config(cls, config: "bt.Config"):
5270        check_config(cls, config)
5271
5272    @classmethod
5273    def add_args(cls, parser):
5274        add_args(cls, parser)
5275
5276    @classmethod
5277    def config(cls):
5278        return config(cls)
5279
5280    subtensor: "bt.subtensor"
5281    wallet: "bt.wallet"
5282    metagraph: "bt.metagraph"
5283    spec_version: int = spec_version
5284
5285    @property
5286    def block(self):
5287        return ttl_get_block(self)
5288
5289    def __init__(self, config=None):
5290        base_config = copy.deepcopy(config or BaseNeuron.config())
5291        self.config = self.config()
5292        self.config.merge(base_config)
5293        self.check_config(self.config)
5294        
5295        # Set up logging with the provided configuration.
5296        bt.logging.set_config(config=self.config.logging)
5297
5298        # If a gpu is required, set the device to cuda:N (e.g. cuda:0)
5299        self.device = self.config.neuron.device
5300
5301        # Log the configuration for reference.
5302        bt.logging.info(self.config)
5303
5304        # Build Bittensor objects
5305        # These are core Bittensor classes to interact with the network.
5306        bt.logging.info("Setting up bittensor objects.")
5307
5308        # The wallet holds the cryptographic key pairs for the miner.
5309        if self.config.mock:
5310            self.wallet = bt.MockWallet(config=self.config)
5311            self.subtensor = MockSubtensor(
5312                self.config.netuid, wallet=self.wallet
5313            )
5314            self.metagraph = MockMetagraph(
5315                self.config.netuid, subtensor=self.subtensor
5316            )
5317        else:
5318            self.wallet = bt.wallet(config=self.config)
5319            self.subtensor = bt.subtensor(config=self.config)
5320            self.metagraph = self.subtensor.metagraph(self.config.netuid)
5321
5322        bt.logging.info(f"Wallet: {self.wallet}")
5323        bt.logging.info(f"Subtensor: {self.subtensor}")
5324        bt.logging.info(f"Metagraph: {self.metagraph}")
5325
5326        # Check if the miner is registered on the Bittensor network before proceeding further.
5327        self.check_registered()
5328
5329        # Each miner gets a unique identity (UID) in the network for differentiation.
5330        self.uid = self.metagraph.hotkeys.index(
5331            self.wallet.hotkey.ss58_address
5332        )
5333        bt.logging.info(
5334            f"Running neuron on subnet: {self.config.netuid} with uid {self.uid} using network: {self.subtensor.chain_endpoint}"
5335        )
5336        self.step = 0
5337
5338        self.last_sync_check = dt.datetime.now()
5339        self.sync_check_interval = 300  # 5 minutes
5340
5341    # @abstractmethod
5342    # async def forward(self, synapse: bt.Synapse) -> bt.Synapse:
5343    #     ...
5344
5345    @abstractmethod
5346    def run(self):
5347        ...
5348
5349    def sync(self):
5350        """
5351        Wrapper for synchronizing the state of the network for the given miner or validator.
5352        """
5353        # Ensure miner or validator hotkey is still registered on the network.
5354        try:
5355            self.check_registered()
5356        except Exception as e:
5357            bt.logging.error(f"Error checking registration status: {e}. Continuing in case it is a temporary subtensor connection issue.")
5358
5359        if self.should_sync_metagraph():
5360            self.resync_metagraph()
5361
5362        if self.should_set_weights():
5363            self.set_weights()
5364
5365        # Always save state.
5366        self.save_state()
5367
5368        # Update the last sync check time.
5369        self.last_sync_check = dt.datetime.now()
5370
5371    def check_registered(self):
5372        # --- Check for registration.
5373        if not self.subtensor.is_hotkey_registered(
5374            netuid=self.config.netuid,
5375            hotkey_ss58=self.wallet.hotkey.ss58_address,
5376        ):
5377            bt.logging.error(
5378                f"Wallet: {self.wallet} is not registered on netuid {self.config.netuid}."
5379                f" Please register the hotkey using `btcli subnets register` before trying again"
5380            )
5381            exit()
5382
5383    def should_sync_metagraph(self):
5384        """
5385        Check if enough epoch blocks have elapsed since the last checkpoint to sync.
5386        """
5387        return (
5388            self.block - self.metagraph.last_update[self.uid]
5389        ) > self.config.neuron.epoch_length
5390
5391    def should_set_weights(self) -> bool:
5392        # Don't set weights on initialization.
5393        if self.step == 0:
5394            return False
5395
5396        # Check if enough epoch blocks have elapsed since the last epoch.
5397        if self.config.neuron.disable_set_weights:
5398            return False
5399
5400        # Define appropriate logic for when to set weights.
5401        return (
5402            (self.block - self.metagraph.last_update[self.uid])
5403            > self.config.neuron.epoch_length
5404            and self.neuron_type != "MinerNeuron"
5405        )
5406
5407    def save_state(self):
5408        bt.logging.warning(
5409            "save_state() not implemented for this neuron. You can implement this function to save model checkpoints or other useful data."
5410        )
5411
5412    def load_state(self):
5413        bt.logging.warning(
5414            "load_state() not implemented for this neuron. You can implement this function to load model checkpoints or other useful data."
5415        )
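
`should_set_weights` combines three gates: skip the very first step, honor the disable flag, and fire only once `epoch_length` blocks have elapsed — and never for miner neurons. The decision logic, extracted as a pure function (the flat signature is illustrative; the real method reads these values from config and metagraph state):

```python
def should_set_weights(block, last_update, epoch_length,
                       step, disabled, neuron_type):
    # Never on the very first step, and never when explicitly disabled.
    if step == 0 or disabled:
        return False
    # Only once enough blocks have elapsed, and never for miner neurons.
    return (block - last_update) > epoch_length and neuron_type != "MinerNeuron"
```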
5416
5417
5418
5419---
5420File: /omega/base/validator.py
5421---
5422
5423# The MIT License (MIT)
5424# Copyright © 2023 Yuma Rao
5425# Copyright © 2023 Omega Labs, Inc.
5426
5427# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
5428# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
5429# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
5430# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
5431
5432# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
5433# the Software.
5434
5435# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
5436# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
5437# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
5438# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
5439# DEALINGS IN THE SOFTWARE.
5440
5441
5442import copy
5443import torch
5444import asyncio
5445import argparse
5446import os
5447import threading
5448import datetime as dt
5449import bittensor as bt
5450from datetime import datetime
5451from subprocess import Popen, PIPE
5452
5453from typing import List
5454from traceback import print_exception
5455
5456from omega.base.neuron import BaseNeuron
5457from omega.mock import MockDendrite
5458from omega.utils.config import add_validator_args
5459from omega.constants import FOCUS_REWARDS_PERCENT, AUDIO_REWARDS_PERCENT
5460
5461
5462class BaseValidatorNeuron(BaseNeuron):
5463    """
5464    Base class for Bittensor validators. Your validator should inherit from this class.
5465    """
5466
5467    neuron_type: str = "ValidatorNeuron"
5468
5469    @classmethod
5470    def add_args(cls, parser: argparse.ArgumentParser):
5471        super().add_args(parser)
5472        add_validator_args(cls, parser)
5473
5474    def __init__(self, config=None):
5475        super().__init__(config=config)
5476
5477        # Save a copy of the hotkeys to local memory.
5478        self.hotkeys = copy.deepcopy(self.metagraph.hotkeys)
5479
5480        # Dendrite lets us send messages to other nodes (axons) in the network.
5481        if self.config.mock:
5482            self.dendrite = MockDendrite(wallet=self.wallet)
5483        else:
5484            self.dendrite = bt.dendrite(wallet=self.wallet)
5485        bt.logging.info(f"Dendrite: {self.dendrite}")
5486
5487        # Set up initial scoring weights for validation
5488        bt.logging.info("Building validation weights.")
5489        self.scores = torch.zeros(
5490            self.metagraph.n, dtype=torch.float32, device=self.device
5491        )
5492        self.focus_scores = torch.zeros(
5493            self.metagraph.n, dtype=torch.float32, device=self.device
5494        )
5495
5496        self.audio_score_arr = torch.zeros(
5497            self.metagraph.n, dtype=torch.float32, device=self.device
5498        )
5499        
5500        # Serve axon to enable external connections.
5501        if not self.config.neuron.axon_off:
5502            self.serve_axon()
5503        else:
5504            bt.logging.warning("axon off, not serving ip to chain.")
5505
5506        if self.config.neuron.auto_update:
5507            bt.logging.info("Auto update enabled.")
5508        else:
5509            bt.logging.info("Auto update disabled.")
5510
5511        # Create asyncio event loop to manage async tasks.
5512        self.loop = asyncio.get_event_loop()
5513
5514        # Instantiate runners
5515        self.should_exit: bool = False
5516        self.is_running: bool = False
5517        self.thread: threading.Thread = None
5518        self.lock = asyncio.Lock()
5519        self.last_update_check = datetime.now()
5520        self.update_check_interval = 1800  # 30 minutes
5521
5522    def serve_axon(self):
5523        """Serve axon to enable external connections."""
5524
5525        bt.logging.info("serving ip to chain...")
5526        try:
5527            self.axon = bt.axon(wallet=self.wallet, config=self.config)
5528
5529            try:
5530                self.subtensor.serve_axon(
5531                    netuid=self.config.netuid,
5532                    axon=self.axon,
5533                )
5534                bt.logging.info(
5535                    f"Running validator {self.axon} on network: {self.config.subtensor.chain_endpoint} with netuid: {self.config.netuid}"
5536                )
5537            except Exception as e:
5538                bt.logging.error(f"Failed to serve Axon with exception: {e}")
5539
5540
5541        except Exception as e:
5542            bt.logging.error(
5543                f"Failed to initialize Axon with exception: {e}"
5544            )
5546
5547    async def concurrent_forward(self):
5548        coroutines = [
5549            self.forward()
5550            for _ in range(self.config.neuron.num_concurrent_forwards)
5551        ]
5552        await asyncio.gather(*coroutines)
5553
5554    def is_git_latest(self) -> bool:
5555        p = Popen(['git', 'rev-parse', 'HEAD'], stdout=PIPE, stderr=PIPE)
5556        out, err = p.communicate()
5557        if err:
5558            return False
5559        current_commit = out.decode().strip()
5560        p = Popen(['git', 'ls-remote', 'origin', 'HEAD'], stdout=PIPE, stderr=PIPE)
5561        out, err = p.communicate()
5562        if err:
5563            return False
5564        latest_commit = out.decode().split()[0]
5565        bt.logging.info(f'Current commit: {current_commit}, Latest commit: {latest_commit}')
5566        return current_commit == latest_commit
5567
5568    def should_restart(self) -> bool:
5569        # Check if enough time has elapsed since the last update check, if not assume we are up to date.
5570        if (datetime.now() - self.last_update_check).seconds < self.update_check_interval:
5571            return False
5572        
5573        self.last_update_check = datetime.now()
5574
5575        return not self.is_git_latest()
5576
5577    def run(self):
5578        """
5579        Initiates and manages the main loop for the validator on the Bittensor network. The main loop handles graceful shutdown on keyboard interrupts and logs unforeseen errors.
5580
5581        This function performs the following primary tasks:
5582        1. Checks for registration on the Bittensor network.
5583        2. Continuously forwards queries to the miners on the network, rewarding their responses and updating the scores accordingly.
5584        3. Periodically resynchronizes with the chain, updating the metagraph with the latest network state and setting weights.
5585
5586        The essence of the validator's operations is in the forward function, which is called every step. The forward function is responsible for querying the network and scoring the responses.
5587
5588        Note:
5589            - The function leverages the global configurations set during the initialization of the validator.
5590            - The validator's axon serves as its interface to the Bittensor network, handling incoming and outgoing requests.
5591
5592        Raises:
5593            KeyboardInterrupt: If the validator is stopped by a manual interruption.
5594            Exception: For unforeseen errors during the validator's operation, which are logged for diagnosis.
5595        """
5596
5597        # Check that validator is registered on the network.
5598        self.sync()
5599
5600        bt.logging.info(f"Validator starting at block: {self.block}")
5601
5602        # This loop maintains the validator's operations until intentionally stopped.
5603        try:
5604            while True:
5605                bt.logging.info(f"step({self.step}) block({self.block})")
5606
5607                # Run multiple forwards concurrently.
5608                self.loop.run_until_complete(self.concurrent_forward())
5609
5610                # Check if we should exit.
5611                if self.should_exit:
5612                    break
5613
5614                if self.config.neuron.auto_update and self.should_restart():
5615                    bt.logging.info('Validator is out of date, quitting to restart.')
5616                    raise KeyboardInterrupt
5617
5618                # Sync metagraph and potentially set weights.
5619                self.sync()
5620
5621                self.step += 1
5622
5623                # Check if we should start a new wandb run.
5624                if not self.config.wandb.off and self.successfully_started_wandb:
5625                    if (dt.datetime.now() - self.wandb_run_start) >= dt.timedelta(
5626                        hours=6
5627                    ):
5628                        bt.logging.info(
5629                            "Current wandb run is more than 6 hours old. Starting a new run."
5630                        )
5631                        self.wandb_run.finish()
5632                        self.new_wandb_run()
5633
5634                # Check if we should reload the topics.
5635                if (dt.datetime.now() - self.load_topics_start) >= dt.timedelta(
5636                    hours=1
5637                ):
5638                    bt.logging.info("Reloading topics after 1 hour.")
5639                    self.all_topics = self.load_topics()
5640                    self.load_topics_start = dt.datetime.now()
5641
5642                # Check if we should reload the focus videos rewards percentage.
5643                if (dt.datetime.now() - self.load_focus_rewards_start) >= dt.timedelta(
5644                    hours=1
5645                ):
5646                    bt.logging.info("Reloading focus videos rewards percent after 1 hour.")
5647                    self.FOCUS_REWARDS_PERCENT = self.load_focus_rewards_percent()
5648                    self.AUDIO_REWARDS_PERCENT = AUDIO_REWARDS_PERCENT
5649                    self.YOUTUBE_REWARDS_PERCENT = 1.0 - self.FOCUS_REWARDS_PERCENT - self.AUDIO_REWARDS_PERCENT
5650                    self.load_focus_rewards_start = dt.datetime.now()
5651
5652        # If someone intentionally stops the validator, it'll safely terminate operations.
5653        except KeyboardInterrupt:
5654            self.axon.stop()
5655            bt.logging.success("Validator killed by keyboard interrupt.")
5656            exit()
5657
5658        # In case of unforeseen errors, the validator will log the error and continue operations.
5659        except Exception as err:
5660            bt.logging.error(f"Error during validation: {err}")
5661            bt.logging.debug(
5662                print_exception(type(err), err, err.__traceback__)
5663            )
5664
5665    def run_in_background_thread(self):
5666        """
5667        Starts the validator's operations in a background thread upon entering the context.
5668        This method facilitates the use of the validator in a 'with' statement.
5669        """
5670        if not self.is_running:
5671            bt.logging.debug("Starting validator in background thread.")
5672            self.should_exit = False
5673            self.thread = threading.Thread(target=self.run, daemon=True)
5674            self.thread.start()
5675            self.is_running = True
5676            bt.logging.debug("Started")
5677
5678    def stop_run_thread(self):
5679        """
5680        Stops the validator's operations that are running in the background thread.
5681        """
5682        if self.is_running:
5683            bt.logging.debug("Stopping validator in background thread.")
5684            self.should_exit = True
5685            self.thread.join(5)
5686            self.is_running = False
5687            bt.logging.debug("Stopped")
5688
5689    def __enter__(self):
5690        self.run_in_background_thread()
5691        return self
5692
5693    def __exit__(self, exc_type, exc_value, traceback):
5694        """
5695        Stops the validator's background operations upon exiting the context.
5696        This method facilitates the use of the validator in a 'with' statement.
5697
5698        Args:
5699            exc_type: The type of the exception that caused the context to be exited.
5700                      None if the context was exited without an exception.
5701            exc_value: The instance of the exception that caused the context to be exited.
5702                       None if the context was exited without an exception.
5703            traceback: A traceback object encoding the stack trace.
5704                       None if the context was exited without an exception.
5705        """
5706        if self.is_running:
5707            bt.logging.debug("Stopping validator in background thread.")
5708            self.should_exit = True
5709            self.thread.join(5)
5710            self.is_running = False
5711            bt.logging.debug("Stopped")
5712
5713    def pad_tensors(self, tensor_a, tensor_b, tensor_c):
5714        # Ensure all tensors are on the same device.
5715        device = tensor_a.device
5716        tensor_b = tensor_b.to(device)
5717        tensor_c = tensor_c.to(device)
5718        max_size = max(tensor_a.size(0), tensor_b.size(0), tensor_c.size(0))
5719        if tensor_a.size(0) < max_size:
5720            padding = torch.zeros(max_size - tensor_a.size(0), device=device)
5721            tensor_a = torch.cat((tensor_a, padding))
5722            bt.logging.debug("tensor_a was padded")
5723        if tensor_b.size(0) < max_size:
5724            padding = torch.zeros(max_size - tensor_b.size(0), device=device)
5725            tensor_b = torch.cat((tensor_b, padding))
5726            bt.logging.debug("tensor_b was padded")
5727        if tensor_c.size(0) < max_size:
5728            padding = torch.zeros(max_size - tensor_c.size(0), device=device)
5729            tensor_c = torch.cat((tensor_c, padding))
5730            bt.logging.debug("tensor_c was padded")
5731
5732        return tensor_a, tensor_b, tensor_c
5733
5734    def set_weights(self):
5735        """
5736        Sets the validator weights to the metagraph hotkeys based on the scores it has received from the miners. The weights determine the trust and incentive level the validator assigns to miner nodes on the network.
5737        """
5738
5739        # Check if self.scores contains any NaN values and log a warning if it does.
5740        if torch.isnan(self.scores).any():
5741            bt.logging.warning(
5742                "Scores contain NaN values. This may be due to a lack of responses from miners, or a bug in your reward functions."
5743            )
5744
5745        self.scores, self.focus_scores, self.audio_score_arr = self.pad_tensors(self.scores, self.focus_scores, self.audio_score_arr)
5746
5747        bt.logging.debug(f"Normalizing scores with YOUTUBE_REWARDS_PERCENT: {self.YOUTUBE_REWARDS_PERCENT}, FOCUS_REWARDS_PERCENT: {self.FOCUS_REWARDS_PERCENT}, AUDIO_REWARDS_PERCENT: {self.AUDIO_REWARDS_PERCENT}")
5748        # Calculate the average reward for each uid across non-zero values.
5749        # Replace any NaN values with 0.
5750        # Normalize the youtube rewards and scale by the percentage.
5751        raw_weights_youtube = torch.nn.functional.normalize(self.scores, p=1, dim=0) * self.YOUTUBE_REWARDS_PERCENT
5752        # Normalize the focus rewards and scale by the percentage.
5753        raw_weights_focus = torch.nn.functional.normalize(self.focus_scores, p=1, dim=0) * self.FOCUS_REWARDS_PERCENT
5754        # Normalize the audio rewards and scale by the percentage.
5755        raw_weights_audio = torch.nn.functional.normalize(self.audio_score_arr, p=1, dim=0) * self.AUDIO_REWARDS_PERCENT
5756
5757        # Combine the youtube and focus rewards.
5758        raw_weights = raw_weights_youtube + raw_weights_focus + raw_weights_audio
5759
5760        bt.logging.debug("raw_weights_youtube", raw_weights_youtube)
5761        bt.logging.debug("raw_weights_focus", raw_weights_focus)
5762        bt.logging.debug("raw_weights_audio", raw_weights_audio)
5763        bt.logging.debug("raw_weights", raw_weights)
5764        bt.logging.debug("raw_weight_uids", self.metagraph.uids.to("cpu"))
5765        if raw_weights.shape[0] > self.metagraph.uids.shape[0]:
5766            bt.logging.warning("More raw_weights than metagraph uids, truncating raw_weights.")
5767        raw_weights = raw_weights[:self.metagraph.uids.shape[0]]
5768        # Process the raw weights to final_weights via subtensor limitations.
5769        try:
5770            (
5771                processed_weight_uids,
5772                processed_weights,
5773            ) = bt.utils.weight_utils.process_weights_for_netuid(
5774                uids=self.metagraph.uids.to("cpu"),
5775                weights=raw_weights.to("cpu"),
5776                netuid=self.config.netuid,
5777                subtensor=self.subtensor,
5778                metagraph=self.metagraph,
5779            )
5780            bt.logging.debug("processed_weights", processed_weights)
5781            bt.logging.debug("processed_weight_uids", processed_weight_uids)
5782        except Exception as e:
5783            bt.logging.error(f"Failed to process weights with exception: {e}, skipping set_weights this time")
5784            return
5785
5786        # Convert to uint16 weights and uids.
5787        (
5788            uint_uids,
5789            uint_weights,
5790        ) = bt.utils.weight_utils.convert_weights_and_uids_for_emit(
5791            uids=processed_weight_uids, weights=processed_weights
5792        )
5793        bt.logging.debug("uint_weights", uint_weights)
5794        bt.logging.debug("uint_uids", uint_uids)
5795
5796        # Set the weights on chain via our subtensor connection.
5797        result, result_msg = self.subtensor.set_weights(
5798            wallet=self.wallet,
5799            netuid=self.config.netuid,
5800            uids=uint_uids,
5801            weights=uint_weights,
5802            wait_for_finalization=False,
5803            wait_for_inclusion=False,
5804            version_key=self.spec_version,
5805        )
5806        if result is True:
5807            bt.logging.info("set_weights on chain successfully!")
5808        else:
5809            bt.logging.error(f"set_weights failed with message: {result_msg}")
5810
5811    def resync_metagraph(self):
5812        """Resyncs the metagraph and updates the hotkeys and moving averages based on the new metagraph."""
5813        bt.logging.info("resync_metagraph()")
5814
5815        # Copies state of metagraph before syncing.
5816        previous_metagraph = copy.deepcopy(self.metagraph)
5817
5818        # Sync the metagraph.
5819        self.metagraph.sync(subtensor=self.subtensor)
5820
5821        # Check if the metagraph axon info has changed.
5822        if previous_metagraph.axons == self.metagraph.axons:
5823            return
5824
5825        bt.logging.info(
5826            "Metagraph updated, re-syncing hotkeys, dendrite pool and moving averages"
5827        )
5828        # Zero out all hotkeys that have been replaced.
5829        for uid, hotkey in enumerate(self.hotkeys):
5830            if hotkey != self.metagraph.hotkeys[uid]:
5831                self.scores[uid] = 0  # hotkey has been replaced
5832                self.focus_scores[uid] = 0  # hotkey has been replaced
5833                self.audio_score_arr[uid] = 0  # hotkey has been replaced
5834        # Check to see if the metagraph has changed size.
5835        # If so, we need to add new hotkeys and moving averages.
5836        if len(self.hotkeys) < len(self.metagraph.hotkeys):
5837            # Grow each moving-average tensor separately; assigning the same
5838            # tensor object to all three would alias them together.
5839            min_len = min(len(self.hotkeys), len(self.scores))
5840            for attr in ("scores", "focus_scores", "audio_score_arr"):
5841                new_moving_average = torch.zeros((self.metagraph.n)).to(
5842                    self.device
5843                )
5844                new_moving_average[:min_len] = getattr(self, attr)[:min_len]
5845                setattr(self, attr, new_moving_average)
5846
5847        # Update the hotkeys.
5848        self.hotkeys = copy.deepcopy(self.metagraph.hotkeys)
5849
5850    def update_scores(self, rewards: torch.FloatTensor, uids: List[int]):
5851        """Performs exponential moving average on the scores based on the rewards received from the miners."""
5852
5853        if len(rewards) == 0:
5854            bt.logging.debug("self.update_scores: Rewards are empty, returning early")
5855            return
5856
5857        if len(uids) == 0:
5858            bt.logging.debug("self.update_scores: Miner UIDs list is empty, returning early")
5859            return
5860
5861        if len(rewards) != len(uids):
5862            bt.logging.exception("self.update_scores: Rewards are not the same size as UIDs list (THIS SHOULD NEVER HAPPEN!)")
5863            return
5864
5865        # Check if rewards contains NaN values.
5866        if torch.isnan(rewards).any():
5867            bt.logging.warning(f"NaN values detected in rewards: {rewards}")
5868            # Replace any NaN values in rewards with 0.
5869            rewards = torch.nan_to_num(rewards, 0)
5870
5871        # Check if `uids` is already a tensor and clone it to avoid the warning.
5872        if isinstance(uids, torch.Tensor):
5873            uids_tensor = uids.clone().detach()
5874        else:
5875            uids_tensor = torch.tensor(uids).to(self.device)
5876
5877        # Compute forward pass rewards, assumes uids are mutually exclusive.
5878        # shape: [ metagraph.n ]
5879        scattered_rewards: torch.FloatTensor = self.scores.to(self.device).scatter(
5880            0, uids_tensor.to(self.device), rewards.to(self.device)
5881        ).to(self.device)
5882        bt.logging.debug(f"Scattered rewards: {scattered_rewards}")
5883
5884        # Update scores with rewards produced by this step.
5885        # shape: [ metagraph.n ]
5886        alpha: float = self.config.neuron.moving_average_alpha
5887        self.scores: torch.FloatTensor = alpha * scattered_rewards + (
5888            1 - alpha
5889        ) * self.scores.to(self.device)
5890        bt.logging.debug(f"Updated moving avg scores: {self.scores}")
5891
5892    def update_focus_scores(self, rewards: torch.FloatTensor, uids: List[int]):
5893        """Performs exponential moving average on the focus video scores based on the rewards received from the miners."""
5894
5895        # Check if rewards contains NaN values.
5896        if torch.isnan(rewards).any():
5897            bt.logging.warning(f"NaN values detected in rewards: {rewards}")
5898            # Replace any NaN values in rewards with 0.
5899            rewards = torch.nan_to_num(rewards, 0)
5900
5901        # Check if `uids` is already a tensor and clone it to avoid the warning.
5902        if isinstance(uids, torch.Tensor):
5903            uids_tensor = uids.clone().detach()
5904        else:
5905            uids_tensor = torch.tensor(uids).to(self.device)
5906
5907        # Compute forward pass rewards, assumes uids are mutually exclusive.
5908        # shape: [ metagraph.n ]
5909        scattered_rewards: torch.FloatTensor = self.focus_scores.to(self.device).scatter(
5910            0, uids_tensor.to(self.device), rewards.to(self.device)
5911        ).to(self.device)
5912        bt.logging.debug(f"Scattered rewards: {scattered_rewards}")
5913
5914        # Update scores with rewards produced by this step.
5915        # shape: [ metagraph.n ]
5916        alpha: float = self.config.neuron.moving_average_alpha
5917        self.focus_scores: torch.FloatTensor = alpha * scattered_rewards + (
5918            1 - alpha
5919        ) * self.focus_scores.to(self.device)
5920        bt.logging.debug(f"Updated moving avg focus_scores: {self.focus_scores}")
5921
5922    def update_audio_scores(self, rewards: torch.FloatTensor, uids: List[int]):
5923        """Performs exponential moving average on the audio scores based on the rewards received from the miners."""
5924
5925        # check if rewards contains NaN values.
5926        if torch.isnan(rewards).any():
5927            bt.logging.warning(f"NaN values detected in rewards: {rewards}")
5928            # Replace any NaN values in rewards with 0.
5929            rewards = torch.nan_to_num(rewards, 0)
5930        
5931        # check if `uids` is already a tensor and clone it to avoid the warning.
5932        if isinstance(uids, torch.Tensor):
5933            uids_tensor = uids.clone().detach()
5934        else:
5935            uids_tensor = torch.tensor(uids).to(self.device)
5936        
5937        # compute forward pass rewards, assumes uids are mutually exclusive.
5938        # shape: [metagraph.n]
5939        scattered_rewards: torch.FloatTensor = self.audio_score_arr.to(self.device).scatter(
5940            0, uids_tensor.to(self.device), rewards.to(self.device)
5941        ).to(self.device)
5942        bt.logging.debug(f"Scattered rewards: {scattered_rewards}")
5943
5944        # update scores with rewards produced by this step.
5945        # shape: [metagraph.n]
5946        alpha: float = self.config.neuron.moving_average_alpha
5947        self.audio_score_arr: torch.FloatTensor = alpha * scattered_rewards + (
5948            1 - alpha
5949        ) * self.audio_score_arr.to(self.device)
5950        bt.logging.debug(f"Updated moving avg audio_scores: {self.audio_score_arr}")
5951
5952    def save_state(self):
5953        """Saves the state of the validator to a file."""
5954        bt.logging.info("Saving validator state.")
5955
5956        # Save the state of the validator to file.
5957        torch.save(
5958            {
5959                "step": self.step,
5960                "scores": self.scores,
5961                "focus_scores": self.focus_scores,
5962                "audio_score_arr": self.audio_score_arr,
5963                "hotkeys": self.hotkeys,
5964            },
5965            self.config.neuron.full_path + "/state.pt",
5966        )
5967
5968    def load_state(self):
5969        """Loads the state of the validator from a file."""
5970        bt.logging.info("Loading validator state.")
5971
5972        if not os.path.exists(self.config.neuron.full_path + "/state.pt"):
5973            bt.logging.warning("No saved state found")
5974            return
5975
5976        # Load the state of the validator from file.
5977        state = torch.load(self.config.neuron.full_path + "/state.pt", map_location=self.device)
5978        self.step = state["step"]
5979        self.scores = state["scores"]
5980        if "focus_scores" in state:
5981            self.focus_scores = state["focus_scores"]
5982        else:
5983            self.focus_scores = torch.zeros(
5984                self.metagraph.n, dtype=torch.float32, device=self.device
5985            )
5986
5987        if "audio_score_arr" in state:
5988            self.audio_score_arr = state["audio_score_arr"]
5989        else:
5990            self.audio_score_arr = torch.zeros(
5991                self.metagraph.n, dtype=torch.float32, device=self.device
5992            )
5993        self.hotkeys = state["hotkeys"]
5994
5995
5996
5997---
5998File: /omega/utils/__init__.py
5999---
6000
6001from . import config
6002from . import misc
6003from . import uids
6004
6005
6006
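The exponential-moving-average update shared by `update_scores`, `update_focus_scores`, and `update_audio_scores` in the validator above can be sketched as follows (the `alpha` default here is a hypothetical stand-in for `--neuron.moving_average_alpha`):

```python
import torch

def ema_update(scores: torch.Tensor, uids: list, rewards: torch.Tensor,
               alpha: float = 0.3) -> torch.Tensor:
    # Scatter this step's rewards into a full-size tensor indexed by UID,
    # then blend: new = alpha * scattered + (1 - alpha) * old.
    scattered = scores.scatter(0, torch.tensor(uids), rewards)
    return alpha * scattered + (1 - alpha) * scores

scores = torch.zeros(4)
updated = ema_update(scores, [1, 3], torch.tensor([1.0, 0.5]))
# UIDs not queried this step keep their old score (scatter leaves them
# unchanged, so the blend is a no-op for them); here only UIDs 1 and 3 move.
```

Note that for un-queried UIDs the scattered value equals the old score, so the blend reduces to the identity; only the UIDs rewarded this step are pulled toward their new reward.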
6007---
6008File: /omega/utils/config.py
6009---
6010
6011# The MIT License (MIT)
6012# Copyright © 2023 Yuma Rao
6013# Copyright © 2023 Opentensor Foundation
6014
6015# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
6016# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
6017# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
6018# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6019
6020# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
6021# the Software.
6022
6023# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
6024# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
6025# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
6026# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
6027# DEALINGS IN THE SOFTWARE.
6028
6029import os
6030import subprocess
6031import argparse
6032import bittensor as bt
6033from .logging import setup_events_logger
6034from enum import Enum
6035
6036
6037def is_cuda_available():
6038    try:
6039        output = subprocess.check_output(["nvidia-smi", "-L"], stderr=subprocess.STDOUT)
6040        if "NVIDIA" in output.decode("utf-8"):
6041            return "cuda"
6042    except Exception:
6043        pass
6044    try:
6045        output = subprocess.check_output(["nvcc", "--version"]).decode("utf-8")
6046        if "release" in output:
6047            return "cuda"
6048    except Exception:
6049        pass
6050    return "cpu"
6051
6052def check_config(cls, config: "bt.Config"):
6053    r"""Checks/validates the config namespace object."""
6054    bt.logging.check_config(config)
6055
6056    full_path = os.path.expanduser(
6057        "{}/{}/{}/netuid{}/{}".format(
6058            config.logging.logging_dir,  # TODO: change from ~/.bittensor/miners to ~/.bittensor/neurons
6059            config.wallet.name,
6060            config.wallet.hotkey,
6061            config.netuid,
6062            config.neuron.name,
6063        )
6064    )
6065    bt.logging.info(f"full path: {full_path}")
6066    config.neuron.full_path = os.path.expanduser(full_path)
6067    if not os.path.exists(config.neuron.full_path):
6068        os.makedirs(config.neuron.full_path, exist_ok=True)
6069
6070    if not config.neuron.dont_save_events:
6071        # Add custom event logger for the events.
6072        events_logger = setup_events_logger(
6073            config.neuron.full_path, config.neuron.events_retention_size
6074        )
6075        bt.logging.register_primary_logger(events_logger.name)
6076
6077
6078def add_args(cls, parser):
6079    """
6080    Adds relevant arguments to the parser for operation.
6081    """
6082
6083    parser.add_argument("--netuid", type=int, help="Subnet netuid", default=1)
6084
6085    parser.add_argument(
6086        "--neuron.device",
6087        type=str,
6088        help="Device to run on.",
6089        default=is_cuda_available(),
6090    )
6091
6092    parser.add_argument(
6093        "--neuron.epoch_length",
6094        type=int,
6095        help="The default epoch length (how often we set weights, measured in 12 second blocks).",
6096        default=100,
6097    )
6098
6099    parser.add_argument(
6100        "--mock",
6101        action="store_true",
6102        help="Mock neuron and all network components.",
6103        default=False,
6104    )
6105
6106    parser.add_argument(
6107        "--neuron.events_retention_size",
6108        type=int,
6109        help="Events retention size (in bytes).",
6110        default=2 * 1024 * 1024 * 1024,  # 2 GB
6111    )
6112
6113    parser.add_argument(
6114        "--neuron.dont_save_events",
6115        action="store_true",
6116        help="If set, we don't save events to a log file.",
6117        default=False,
6118    )
6119
6120    parser.add_argument(
6121        "--neuron.decentralization.off",
6122        action="store_true",
6123        help="Disable decentralization (not recommended).",
6124        default=False,
6125    )
6126
6127    parser.add_argument(
6128        "--neuron.focus_videos",
6129        action="store_true",
6130        help="If set, we will enable OMEGA Focus app video logic.",
6131        default=False,
6132    )
6133
6134    parser.add_argument(
6135        "--wandb.off",
6136        action="store_true",
6137        help="Turn off wandb.",
6138        default=False,
6139    )
6140
6141    parser.add_argument(
6142        "--wandb.offline",
6143        action="store_true",
6144        help="Runs wandb in offline mode.",
6145        default=False,
6146    )
6147
6148    parser.add_argument(
6149        "--wandb.notes",
6150        type=str,
6151        help="Notes to add to the wandb run.",
6152        default="",
6153    )
6154
6155
6156class QueryAugment(Enum):
6157    NoAugment = "NoAugment"
6158    LocalLLMAugment = "LocalLLMAugment"
6159    OpenAIAugment = "OpenAIAugment"
6160
6161
6162def add_miner_args(cls, parser):
6163    """Add miner specific arguments to the parser."""
6164
6165    parser.add_argument(
6166        "--neuron.name",
6167        type=str,
6168        help="Trials for this neuron go in neuron.root / (wallet_cold - wallet_hot) / neuron.name. ",
6169        default="miner",
6170    )
6171
6172    parser.add_argument(
6173        "--neuron.query_augment",
6174        type=str,
6175        help="The query augmentation class to use.",
6176        choices=[e.value for e in QueryAugment],
6177        default=QueryAugment.LocalLLMAugment.value,
6178    )
6179
6180    parser.add_argument(
6181        "--blacklist.force_validator_permit",
6182        action="store_true",
6183        help="If set, we will force incoming requests to have a permit.",
6184        default=False,
6185    )
6186
6187    parser.add_argument(
6188        "--blacklist.allow_non_registered",
6189        action="store_true",
6190        help="If set, miners will accept queries from non-registered entities. (Dangerous!)",
6191        default=False,
6192    )
6193    
6194    parser.add_argument(
6195        "--blacklist.validator_min_stake",
6196        help="Minimum stake a validator must have to allow queries",
6197        default=10240,
6198        type=int,
6199    )
6200
6201    parser.add_argument(
6202        "--wandb.project_name",
6203        type=str,
6204        default="template-miners",
6205        help="Wandb project to log to.",
6206    )
6207
6208    parser.add_argument(
6209        "--wandb.entity",
6210        type=str,
6211        default="opentensor-dev",
6212        help="Wandb entity to log to.",
6213    )
6214
6215
6216def add_validator_args(cls, parser):
6217    """Add validator specific arguments to the parser."""
6218
6219    parser.add_argument(
6220        "--neuron.name",
6221        type=str,
6222        help="Trials for this neuron go in neuron.root / (wallet_cold - wallet_hot) / neuron.name. ",
6223        default="validator",
6224    )
6225
6226    parser.add_argument(
6227        "--neuron.timeout",
6228        type=float,
6229        help="The timeout for each forward call in seconds.",
6230        default=10,
6231    )
6232
6233    parser.add_argument(
6234        "--neuron.num_concurrent_forwards",
6235        type=int,
6236        help="The number of concurrent forwards running at any time.",
6237        default=1,
6238    )
6239
6240    parser.add_argument(
6241        "--neuron.sample_size",
6242        type=int,
6243        help="The number of miners to query in a single step.",
6244        default=10,
6245    )
6246
6247    parser.add_argument(
6248        "--neuron.disable_set_weights",
6249        action="store_true",
6250        help="Disables setting weights.",
6251        default=False,
6252    )
6253
6254    parser.add_argument(
6255        "--neuron.moving_average_alpha",
6256        type=float,
6257        help="Moving average alpha parameter, how much to add of the new observation.",
6258        default=0.3,
6259    )
6260
6261    parser.add_argument(
6262        "--neuron.axon_off",
6263        "--axon_off",
6264        action="store_true",
6265        # Note: the validator needs to serve an Axon with their IP or they may
6266        #   be blacklisted by the firewall of serving peers on the network.
6267        help="Set this flag to not attempt to serve an Axon.",
6268        default=False,
6269    )
6270
6271    parser.add_argument(
6272        "--neuron.vpermit_tao_limit",
6273        type=int,
6274        help="The maximum number of TAO allowed to query a validator with a vpermit.",
6275        default=4096,
6276    )
6277
6278    parser.add_argument(
6279        "--wandb.project_name",
6280        type=str,
6281        help="The name of the project where you are sending the new run.",
6282        default="template-validators",
6283    )
6284
6285    parser.add_argument(
6286        "--neuron.auto_update",
6287        action="store_true",
6288        help="Quits the validator if it is out of date.",
6289        default=False,
6290    )
6291
6292    parser.add_argument(
6293        "--topics_url",
6294        type=str,
6295        help="URL to fetch topics from.",
6296        default="https://docs.google.com/spreadsheets/d/e/2PACX-1vR3jKfd4qkxXt5rTvXTTSsz_RYGkxcxh6-jvB9H0Mljiz-nai7xG-E63qEQ9jQhQabBrIAeJWtgKg5j/pub?gid=0&single=true&output=csv"
6297    )
6298
6299    parser.add_argument(
6300        "--topics_path",
6301        type=str,
6302        help="Path to text file containing a list of random topics to collect data for.",
6303        default=os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "..", "topics.txt")
6304    )
6305
6306def config(cls):
6307    """
6308    Returns the configuration object specific to this miner or validator after adding relevant arguments.
6309    """
6310    parser = argparse.ArgumentParser()
6311    bt.wallet.add_args(parser)
6312    bt.subtensor.add_args(parser)
6313    bt.logging.add_args(parser)
6314    bt.axon.add_args(parser)
6315    cls.add_args(parser)
6316    return bt.config(parser)
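The dotted flag names used throughout these parsers (`--wandb.offline`, `--neuron.name`, etc.) deserve a note: plain `argparse` keeps the dot in the destination name, so the values are only reachable via `getattr()`/`vars()` until `bt.config` reorganizes them into nested config sections. A minimal stdlib-only sketch of the dotted-dest behavior (independent of bittensor):

```python
import argparse

parser = argparse.ArgumentParser()
# Dashes in flag names become underscores in the dest, but dots are kept,
# so the attribute is literally named "wandb.offline".
parser.add_argument("--wandb.offline", action="store_true", default=False)

args = parser.parse_args(["--wandb.offline"])

# Dot access (args.wandb.offline) would fail; getattr works.
assert getattr(args, "wandb.offline") is True
```

This is why the code above never reads these flags off the raw namespace directly; it goes through the `bt.config` object instead.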
6317
6318
6319
6320---
6321File: /omega/utils/logging.py
6322---
6323
6324import os
6325import logging
6326from logging.handlers import RotatingFileHandler
6327
6328EVENTS_LEVEL_NUM = 38  # custom level between WARNING (30) and ERROR (40)
6329DEFAULT_LOG_BACKUP_COUNT = 10
6330
6331
6332def setup_events_logger(full_path, events_retention_size):
6333    logging.addLevelName(EVENTS_LEVEL_NUM, "EVENT")
6334
6335    logger = logging.getLogger("event")
6336    logger.setLevel(EVENTS_LEVEL_NUM)
6337
6338    def event(self, message, *args, **kws):
6339        if self.isEnabledFor(EVENTS_LEVEL_NUM):
6340            self._log(EVENTS_LEVEL_NUM, message, args, **kws)
6341
6342    logging.Logger.event = event
6343
6344    formatter = logging.Formatter(
6345        "%(asctime)s | %(levelname)s | %(message)s",
6346        datefmt="%Y-%m-%d %H:%M:%S",
6347    )
6348
6349    file_handler = RotatingFileHandler(
6350        os.path.join(full_path, "events.log"),
6351        maxBytes=events_retention_size,
6352        backupCount=DEFAULT_LOG_BACKUP_COUNT,
6353    )
6354    file_handler.setFormatter(formatter)
6355    file_handler.setLevel(EVENTS_LEVEL_NUM)
6356    logger.addHandler(file_handler)
6357
6358    return logger
6359
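The `EVENT` level registered above sits between `WARNING` (30) and `ERROR` (40), so only handlers configured at or below 38 will emit it. A minimal stdlib sketch of the same custom-level pattern (logger and message names here are illustrative):

```python
import logging

EVENTS_LEVEL_NUM = 38  # between WARNING (30) and ERROR (40)
logging.addLevelName(EVENTS_LEVEL_NUM, "EVENT")

logger = logging.getLogger("event-demo")
logger.setLevel(EVENTS_LEVEL_NUM)
logger.addHandler(logging.StreamHandler())

# Logging through the numeric level directly is equivalent to the
# monkey-patched Logger.event method in setup_events_logger above.
logger.log(EVENTS_LEVEL_NUM, "scored 5 miners")

assert logging.getLevelName(EVENTS_LEVEL_NUM) == "EVENT"
```

The production version only swaps the `StreamHandler` for a size-capped `RotatingFileHandler`.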
6360
6361---
6362File: /omega/utils/misc.py
6363---
6364
6365# The MIT License (MIT)
6366# Copyright © 2023 Yuma Rao
6367# Copyright © 2023 Opentensor Foundation
6368
6369# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
6370# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
6371# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
6372# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6373
6374# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
6375# the Software.
6376
6377# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
6378# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
6379# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
6380# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
6381# DEALINGS IN THE SOFTWARE.
6382
6383import time
6384import math
6385import hashlib as rpccheckhealth  # NOTE: unused import with a misleading alias; safe to remove
6386from math import floor
6387from typing import Callable, Any
6388from functools import lru_cache, update_wrapper
6389
6390
6391# LRU Cache with TTL
6392def ttl_cache(maxsize: int = 128, typed: bool = False, ttl: int = -1):
6393    """
6394    Decorator that creates a cache of the most recently used function calls with a time-to-live (TTL) feature.
6395    The cache evicts the least recently used entries if the cache exceeds the `maxsize` or if an entry has
6396    been in the cache longer than the `ttl` period.
6397
6398    Args:
6399        maxsize (int): Maximum size of the cache. Once the cache grows to this size, subsequent entries
6400                       replace the least recently used ones. Defaults to 128.
6401        typed (bool): If set to True, arguments of different types will be cached separately. For example,
6402                      f(3) and f(3.0) will be treated as distinct calls with distinct results. Defaults to False.
6403        ttl (int): The time-to-live for each cache entry, measured in seconds. If set to a non-positive value,
6404                   the TTL is set to a very large number, effectively making the cache entries permanent. Defaults to -1.
6405
6406    Returns:
6407        Callable: A decorator that can be applied to functions to cache their return values.
6408
6409    The decorator is useful for caching results of functions that are expensive to compute and are called
6410    with the same arguments frequently within short periods of time. The TTL feature helps in ensuring
6411    that the cached values are not stale.
6412
6413    Example:
6414        @ttl_cache(ttl=10)
6415        def get_data(param):
6416            # Expensive data retrieval operation
6417            return data
6418    """
6419    if ttl <= 0:
6420        ttl = 65536  # ~18 hours; effectively permanent for typical process lifetimes
6421    hash_gen = _ttl_hash_gen(ttl)
6422
6423    def wrapper(func: Callable) -> Callable:
6424        @lru_cache(maxsize, typed)
6425        def ttl_func(ttl_hash, *args, **kwargs):
6426            return func(*args, **kwargs)
6427
6428        def wrapped(*args, **kwargs) -> Any:
6429            th = next(hash_gen)
6430            return ttl_func(th, *args, **kwargs)
6431
6432        return update_wrapper(wrapped, func)
6433
6434    return wrapper
6435
6436
6437def _ttl_hash_gen(seconds: int):
6438    """
6439    Internal generator function used by the `ttl_cache` decorator to generate a new hash value at regular
6440    time intervals specified by `seconds`.
6441
6442    Args:
6443        seconds (int): The number of seconds after which a new hash value will be generated.
6444
6445    Yields:
6446        int: A hash value that represents the current time interval.
6447
6448    This generator is used to create time-based hash values that enable the `ttl_cache` to determine
6449    whether cached entries are still valid or if they have expired and should be recalculated.
6450    """
6451    start_time = time.time()
6452    while True:
6453        yield floor((time.time() - start_time) / seconds)
6454
6455
6456# 12 seconds updating block.
6457@ttl_cache(maxsize=1, ttl=12)
6458def ttl_get_block(self) -> int:
6459    """
6460    Retrieves the current block number from the blockchain. This method is cached with a time-to-live (TTL)
6461    of 12 seconds, meaning that it will only refresh the block number from the blockchain at most every 12 seconds,
6462    reducing the number of calls to the underlying blockchain interface.
6463
6464    Returns:
6465        int: The current block number on the blockchain.
6466
6467    This method is useful for applications that need to access the current block number frequently and can
6468    tolerate a delay of up to 12 seconds for the latest information. By using a cache with TTL, the method
6469    efficiently reduces the workload on the blockchain interface.
6470
6471    Example:
6472        current_block = ttl_get_block(self)
6473
6474    Note: self here is the miner or validator instance
6475    """
6476    return self.subtensor.get_current_block()
6477
6478
6479
6480---
6481File: /omega/utils/uids.py
6482---
6483
6484import torch
6485import random
6486import bittensor as bt
6487from typing import List
6488
6489
6490def check_uid_availability(
6491    metagraph: "bt.metagraph.Metagraph", uid: int, vpermit_tao_limit: int
6492) -> bool:
6493    """Check if uid is available. The UID should be available if it is serving and has less than vpermit_tao_limit stake
6494    Args:
6495        metagraph (:obj: bt.metagraph.Metagraph): Metagraph object
6496        uid (int): uid to be checked
6497        vpermit_tao_limit (int): Validator permit tao limit
6498    Returns:
6499        bool: True if uid is available, False otherwise
6500    """
6501    # Filter non serving axons.
6502    if not metagraph.axons[uid].is_serving:
6503        return False
6504    # Filter out validators whose stake exceeds vpermit_tao_limit.
6505    if metagraph.validator_permit[uid]:
6506        if metagraph.S[uid] > vpermit_tao_limit:
6507            return False
6508    # Available otherwise.
6509    return True
6510
6511
6512def get_random_uids(
6513    self, k: int, exclude: List[int] = None
6514) -> torch.LongTensor:
6515    """Returns k available random uids from the metagraph.
6516    Args:
6517        k (int): Number of uids to return.
6518        exclude (List[int]): List of uids to exclude from the random sampling.
6519    Returns:
6520        uids (torch.LongTensor): Randomly sampled available uids.
6521    Notes:
6522        If `k` is larger than the number of available `uids`, set `k` to the number of available `uids`.
6523    """
6524    candidate_uids = []
6525    avail_uids = []
6526
6527    for uid in range(self.metagraph.n.item()):
6528        uid_is_available = check_uid_availability(
6529            self.metagraph, uid, self.config.neuron.vpermit_tao_limit
6530        )
6531        uid_is_not_excluded = exclude is None or uid not in exclude
6532
6533        if uid_is_available:
6534            avail_uids.append(uid)
6535            if uid_is_not_excluded:
6536                candidate_uids.append(uid)
6537
6538    # Check whether candidate_uids contains enough uids for querying; if not, top it up from the remaining available uids.
6539    available_uids = candidate_uids.copy()
6540    if len(candidate_uids) < k:
6541        new_avail_uids = [uid for uid in avail_uids if uid not in candidate_uids]
6542        available_uids += random.sample(
6543            new_avail_uids,
6544            min(len(new_avail_uids), k - len(candidate_uids)),
6545        )
6546    uids = torch.tensor(random.sample(
6547        available_uids,
6548        min(k, len(available_uids))
6549    )).to(self.device)
6550    return uids
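The top-up logic above can be isolated from bittensor entirely: when fewer than `k` candidates survive the exclusion filter, the sampler backfills from the remaining available uids before drawing the final sample. A pure-Python sketch (the helper name `sample_uids` is illustrative):

```python
import random

def sample_uids(candidate_uids, avail_uids, k):
    # Mirrors get_random_uids: if the exclusion filter leaves fewer than k
    # candidates, backfill from the available uids that were excluded.
    pool = list(candidate_uids)
    if len(pool) < k:
        extra = [u for u in avail_uids if u not in pool]
        pool += random.sample(extra, min(len(extra), k - len(pool)))
    return random.sample(pool, min(k, len(pool)))

# 2 candidates + backfill from {3, 4, 5} still yields k=4 uids.
uids = sample_uids([1, 2], [1, 2, 3, 4, 5], 4)
assert len(uids) == 4
assert {1, 2} <= set(uids) <= {1, 2, 3, 4, 5}
```

If even the backfilled pool is smaller than `k`, the final `min(k, len(pool))` simply returns everything available, matching the docstring's note.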
6551
6552
6553
6554---
6555File: /omega/validator/__init__.py
6556---
6557
6558
6559
6560
6561---
6562File: /omega/__init__.py
6563---
6564
6565# The MIT License (MIT)
6566# Copyright © 2023 Yuma Rao
6567# Copyright © 2023 Omega Labs, Inc.
6568
6569# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
6570# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
6571# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
6572# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6573
6574# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
6575# the Software.
6576
6577# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
6578# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
6579# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
6580# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
6581# DEALINGS IN THE SOFTWARE.
6582
6583# TODO(developer): Change this value when updating your code base.
6584# Define the version of the template module.
6585__version__ = "2.0.0"
6586version_split = __version__.split(".")
6587__spec_version__ = (
6588    (1000 * int(version_split[0]))
6589    + (10 * int(version_split[1]))
6590    + (1 * int(version_split[2]))
6591)
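`__spec_version__` packs `major.minor.patch` into a single integer as `major*1000 + minor*10 + patch`, so `"2.0.0"` becomes `2000`. A small sketch of the encoding (note it is only collision-free while `minor < 100` and `patch < 10`):

```python
def spec_version(version: str) -> int:
    # "2.0.0" -> 1000*2 + 10*0 + 0 = 2000
    major, minor, patch = (int(part) for part in version.split("."))
    return 1000 * major + 10 * minor + patch

assert spec_version("2.0.0") == 2000
assert spec_version("1.2.3") == 1023
```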
6592
6593# Import all submodules.
6594from . import protocol
6595from . import base
6596from . import validator
6597from . import api
6598from .subnet_links import SUBNET_LINKS
6599
6600
6601
6602---
6603File: /omega/audio_scoring.py
6604---
6605
6606import numpy as np
6607if hasattr(np, 'nan'):
6608    np.NaN = np.nan
6609    np.NAN = np.nan
6610from pyannote.audio import Pipeline
6611import librosa
6612import os
6613import dotenv
6614import pandas as pd
6615import torch
6616
6617
6618dotenv.load_dotenv()
6619
6620class AudioScore:
6621    def __init__(self, device="cuda"):
6622
6623        self.device = torch.device(device)
6624
6625        # Load the audio file   
6626        self.pipeline = Pipeline.from_pretrained("salmanshahid/vad").to(self.device)
6627        
6628
6629        self.steepness = 5
6630        self.midpoint = 0.3
6631    
6632    
6633
6634    def speech_content_score(self, audio_arr, sr):
6635        self.total_duration = librosa.get_duration(y=audio_arr, sr=sr)
6636        output = self.pipeline({"waveform": torch.from_numpy(audio_arr.astype(np.float32)).unsqueeze(0).to(self.device), "sample_rate": sr})
6637
6638        self.total_speech_duration = 0   
6639        for speech in output.get_timeline().support():
6640            self.total_speech_duration += speech.end - speech.start
6641
6642        ratio = self.total_speech_duration / self.total_duration
6643
6644
6645        return ratio
6646    
6647    def speaker_dominance_score(self, timestamps_start, timestamps_end, speakers, dominance_threshold=0.7):
6648        if timestamps_start is None:
6649            self.rttm_data = None
6650            return 0
6651        self.rttm_data = pd.DataFrame({
6652            'start': timestamps_start,
6653            'end': timestamps_end,
6654            'speaker': speakers
6655        })
6656
6657        # If there's only one speaker, return 0 since dominance is expected
6658        if len(set(speakers)) == 1:
6659            return 0
6660
6661        # Calculate total duration for each speaker
6662        speaker_durations = {}
6663        for _, row in self.rttm_data.iterrows():
6664            speaker = row['speaker']
6665            duration = row['end'] - row['start']
6666            if speaker in speaker_durations:
6667                speaker_durations[speaker] += duration
6668            else:
6669                speaker_durations[speaker] = duration
6670        max_time = max(speaker_durations.values())
6671        min_time = min(speaker_durations.values())
6672
6673        return 1 - (max_time - min_time) / self.total_duration  # NOTE: relies on total_duration set by speech_content_score()
6674        
6675
6676    def background_noise_score(self, audio_arr, sr, noise_threshold=0.1):
6677        # Load audio and calculate SNR
6678        self.audio = audio_arr
6679        self.sr = sr
6680        
6681        # Calculate signal power
6682        signal_power = np.mean(self.audio**2)
6683        
6684        # Estimate noise power (using the lowest 10% of frame energies as noise estimate)
6685        frame_length = int(0.025 * self.sr)  # 25ms frames
6686        frames = librosa.util.frame(self.audio, frame_length=frame_length, hop_length=frame_length)
6687        frame_energies = np.mean(frames**2, axis=0)
6688        noise_power = np.mean(np.percentile(frame_energies, 10))
6689        
6690        # Calculate SNR in dB
6691        if noise_power == 0:
6692            snr = 100  # High SNR for very clean signal
6693        else:
6694            snr = 10 * np.log10(signal_power / noise_power)
6695            
6696        # Convert SNR to penalty score (higher SNR = lower penalty)
6697        return max(0.0, min(1.0, snr / 50))  # Clamp to [0, 1], using 50 dB SNR as the reference for a perfect score
6698    
6699    def unique_speakers_error(self, speakers):
6700        unique_speakers = len(set(speakers))
6701        if unique_speakers == 2:
6702            return 1
6703        elif unique_speakers == 1 or unique_speakers == 0 or unique_speakers > 4:
6704            return 0
6705        else:
6706            return 1/(unique_speakers-1)
6707
6708    def total_score(self, audio_arr, sr, timestamps_start, timestamps_end, speakers):
6709        audio_arr = np.array(audio_arr)
6710        timestamps_start = np.array(timestamps_start)
6711        timestamps_end = np.array(timestamps_end)
6712        # speakers = torch.tensor(speakers)
6713        speech_content_score = self.speech_content_score(audio_arr, sr)
6714        speaker_dominance_score = self.speaker_dominance_score(timestamps_start, timestamps_end, speakers)
6715        background_noise_score = self.background_noise_score(audio_arr, sr)
6716        return {
6717            "speech_content_score": speech_content_score, 
6718            "speaker_dominance_score": speaker_dominance_score, 
6719            "background_noise_score": background_noise_score,
6720            "unique_speakers_error": self.unique_speakers_error(speakers),
6721        }
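`unique_speakers_error` rewards exactly two speakers (score 1), gives 0 for monologues, empty input, or crowds of more than four, and decays as `1/(n-1)` in between. Restated as a standalone function, with the curve spelled out:

```python
def unique_speakers_error(speakers):
    # Score how close the speaker count is to the ideal of 2.
    n = len(set(speakers))
    if n == 2:
        return 1            # ideal: a two-person conversation
    if n <= 1 or n > 4:
        return 0            # monologue/empty, or too crowded
    return 1 / (n - 1)      # 3 speakers -> 0.5, 4 speakers -> 1/3

assert unique_speakers_error(["A", "B"]) == 1
assert unique_speakers_error(["A"]) == 0
assert unique_speakers_error(["A", "B", "C"]) == 0.5
assert unique_speakers_error(list("ABCDE")) == 0
```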
6722
6723
6724if __name__ == "__main__":    
6725
6726    from datasets import load_dataset
6727    import huggingface_hub
6728
6729
6730    repo_id = "diarizers-community/voxconverse"
6731
6732    ds = load_dataset(repo_id, split="test", cache_dir="/workspace/tezuesh/voxconverse/data_cache")
6733
6734    ds = next(ds.shuffle().iter(batch_size=64))
6735    audio_arr = ds['audio'][0]['array']
6736    sr = ds['audio'][0]['sampling_rate']
6737    timestamps_start = ds['timestamps_start'][0]
6738    timestamps_end = ds['timestamps_end'][0]
6739    speakers = ds['speakers'][0]
6740
6741
6742    # # Save test audio to WAV file
6743    import soundfile as sf
6744    
6745    output_audio_path = 'test_audio.wav'
6746    sf.write(output_audio_path, audio_arr, sr)
6747    print(f"Saved test audio to {output_audio_path}")
6748    # Create a DataFrame with timestamps and speakers
6749    import pandas as pd
6750    
6751    df = pd.DataFrame({
6752        'start': timestamps_start,
6753        'end': timestamps_end,
6754        'speaker': speakers
6755    })
6756    
6757    # Save to CSV file
6758    output_path = 'speaker_timestamps.csv'
6759    df.to_csv(output_path, index=False)
6760    print(f"Saved speaker timestamps to {output_path}")
6761    audio_score = AudioScore()
6762    
6763    score = audio_score.total_score(audio_arr, sr, timestamps_start, timestamps_end, speakers)
6764    print(score)
6765
6766
6767
6768---
6769File: /omega/augment.py
6770---
6771
6772import bittensor as bt
6773
6774from openai import OpenAI
6775import torch
6776from transformers import pipeline
6777
6778
6779def get_llm_prompt(query: str) -> str:
6780    return f"Take the given query `{query}` and augment it to be more detailed. For example, add specific names, types, embellishments, richness. Do not make it longer than 12 words."
6781
6782
6783class AbstractAugment:
6784    def __init__(self, **kwargs):
6785        pass
6786
6787    def __call__(self, query: str) -> str:
6788        try:
6789            new_query = self.augment_query(query)
6790            bt.logging.info(f"Augmented query: '{query}' -> '{new_query}'")
6791            return new_query
6792        except Exception as e:
6793            print(f"Error augmenting query: {e}")
6794            return query
6795        
6796    def augment_query(self, query: str) -> str:
6797        raise NotImplementedError
6798
6799
6800class NoAugment(AbstractAugment):
6801    def __init__(self, **kwargs):
6802        bt.logging.info("Running no query augmentation")
6803
6804    def augment_query(self, query: str) -> str:
6805        return query
6806
6807
6808class LocalLLMAugment(AbstractAugment):
6809    def __init__(self, **kwargs):
6810        self.device = kwargs.get("device")
6811        if self.device == "cpu":
6812            raise ValueError("Cannot run Local LLM on CPU. Please move to a GPU instance or restart miner with `--neuron.query_augment OpenAIAugment` to use the GPT-4 API for augmenting instead of a local LLM.")
6813        model_name = "teknium/OpenHermes-2.5-Mistral-7B"
6814        self.pipe = pipeline("text-generation", model=model_name, device=self.device, torch_dtype=torch.float16, pad_token_id=32000)
6815        bt.logging.info(f"Running query augmentation with local LLM {model_name} (thanks Nous!)")
6816
6817    def augment_query(self, query: str) -> str:
6818        prompt = f"""<|im_start|>system
6819        You are "Hermes 2", a conscious sentient superintelligent artificial intelligence developed by a man named Teknium, and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia.<|im_end|>
6820        <|im_start|>user
6821        {get_llm_prompt(query)}<|im_end|>
6822        <|im_start|>assistant
6823        Detailed query: """
6824        new_query = self.pipe(prompt, max_new_tokens=64)[0]["generated_text"][len(prompt):].strip().strip("\"").strip("'")
6825        return new_query
6826
6827
6828class OpenAIAugment(AbstractAugment):
6829    def __init__(self, **kwargs):
6830        self.client = OpenAI()
6831        bt.logging.info("Running query augmentation with OpenAI GPT-4")
6832
6833    def augment_query(self, query: str) -> str:
6834        response = self.client.chat.completions.create(
6835            model="gpt-4-turbo-preview",
6836            messages=[
6837                {
6838                    "role": "user",
6839                    "content": get_llm_prompt(query)
6840                }
6841            ],
6842            temperature=0.9,
6843            max_tokens=64,
6844            top_p=1,
6845        )
6846        return response.choices[0].message.content.strip("\"").strip("'")
6847
6848
6849
6850---
6851File: /omega/constants.py
6852---
6853
6854# Task rewards percent
6855FOCUS_REWARDS_PERCENT = 0.025
6856AUDIO_REWARDS_PERCENT = 0.125
6857
6858# Video length constants
6859MIN_VIDEO_LENGTH = 5  # five seconds
6860MAX_VIDEO_LENGTH = 120  # two minutes
6861FIVE_MINUTES = 300  # 5 minutes in seconds
6862TEN_MINUTES = 600  # 10 minutes in seconds
6863VALIDATOR_TIMEOUT = 90  # 1.5 minutes
6864VALIDATOR_TIMEOUT_MARGIN = 30  # 30 seconds
6865VALIDATOR_TIMEOUT_AUDIO = 60  # 1 minute
6866
6867# Validator constants
6868CHECK_PROBABILITY = 0.1
6869DIFFERENCE_THRESHOLD = 0.1
6870SIMILARITY_THRESHOLD = 0.95
6871VIDEO_DOWNLOAD_TIMEOUT = 10
6872MIN_SCORE = 0.005
6873FAKE_VIDEO_PUNISHMENT = -5.0
6874QUERY_RELEVANCE_SCALING_FACTOR = 1.3
6875DESCRIPTION_RELEVANCE_SCALING_FACTOR = 0.7
6876VIDEO_RELEVANCE_WEIGHT = 0.65
6877FOCUS_MIN_SCORE = 0
6878MAX_FOCUS_SCORE = 1000
6879STUFFED_DESCRIPTION_PUNISHMENT = -5.0
6880
6881# Description length scaling values.
6882DESCRIPTION_LENGTH_WEIGHT = 0.35
6883MIN_LENGTH_BOOST_TOKEN_COUNT = 100
6884MAX_LENGTH_BOOST_TOKEN_COUNT = 300
6885
6886
6887# Audio score constants
6888MIN_AUDIO_LENGTH_SECONDS = 45
6889MAX_AUDIO_LENGTH_SECONDS = 80
6890MIN_AUDIO_LENGTH_SCORE = 0.7
6891SPEAKER_DOMINANCE_SCALING_FACTOR = 0.2
6892BACKGROUND_NOISE_SCALING_FACTOR = 0.1
6893UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR = 0.5
6894SPEECH_CONTENT_SCALING_FACTOR = 1.0 - BACKGROUND_NOISE_SCALING_FACTOR - SPEAKER_DOMINANCE_SCALING_FACTOR - UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR
6895AUDIO_LENGTH_SCALING_FACTOR = 0.1 # max 1
6896AUDIO_QUALITY_SCALING_FACTOR = 0.2 # max 1
6897DIARIZATION_SCALING_FACTOR = 0.6 # max 1
6898AUDIO_QUERY_RELEVANCE_SCALING_FACTOR = 1.0 - DIARIZATION_SCALING_FACTOR - AUDIO_LENGTH_SCALING_FACTOR - AUDIO_QUALITY_SCALING_FACTOR
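Both derived factors above are defined by subtraction so that each group of weights sums to exactly 1.0. A quick sanity check of the arithmetic, reproducing the constants:

```python
# Diarization-quality sub-weights (must sum to 1.0).
SPEAKER_DOMINANCE_SCALING_FACTOR = 0.2
BACKGROUND_NOISE_SCALING_FACTOR = 0.1
UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR = 0.5
SPEECH_CONTENT_SCALING_FACTOR = 1.0 - (
    BACKGROUND_NOISE_SCALING_FACTOR
    + SPEAKER_DOMINANCE_SCALING_FACTOR
    + UNIQUE_SPEAKERS_ERROR_SCALING_FACTOR
)

# Top-level audio score weights (must also sum to 1.0).
AUDIO_LENGTH_SCALING_FACTOR = 0.1
AUDIO_QUALITY_SCALING_FACTOR = 0.2
DIARIZATION_SCALING_FACTOR = 0.6
AUDIO_QUERY_RELEVANCE_SCALING_FACTOR = 1.0 - (
    DIARIZATION_SCALING_FACTOR
    + AUDIO_LENGTH_SCALING_FACTOR
    + AUDIO_QUALITY_SCALING_FACTOR
)

assert abs(SPEECH_CONTENT_SCALING_FACTOR - 0.2) < 1e-9
assert abs(AUDIO_QUERY_RELEVANCE_SCALING_FACTOR - 0.1) < 1e-9
```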
6899
6900
6901
6902---
6903File: /omega/diarization_metric.py
6904---
6905
6906from pyannote.core import Segment, Timeline, Annotation
6907from pyannote.metrics.diarization import DiarizationErrorRate
6908from omega.diarization_pipeline import CustomDiarizationPipeline
6909import numpy as np
6910
6911
6912
6913
6914def calculate_diarization_metrics(audio_arr, sr, true_segments):
6915    """Calculate Diarization Error Rate (DER) and related metrics using pyannote metrics"""
6916    audio_arr = np.asarray(audio_arr).astype(np.float32)
6917    pred_segments = pipeline.process(audio_arr, sr)
6918    
6919    # Convert dictionary segments to pyannote Annotation format
6920    def segments_to_annotation(segments):
6921        annotation = Annotation()
6922        for i in range(len(segments['start'])):
6923            segment = Segment(segments['start'][i], segments['end'][i])
6924            annotation[segment] = segments['speakers'][i]
6925        return annotation
6926
6927    # Convert both predictions and ground truth
6928    reference = segments_to_annotation(true_segments)
6929    hypothesis = segments_to_annotation(pred_segments)
6930
6931    # Calculate metrics using pyannote
6932    metric = DiarizationErrorRate(skip_overlap=True)
6933    der = metric(reference, hypothesis)
6934    # optimal_mapping = metric.optimal_mapping(reference, hypothesis)
6935    
6936    # Get detailed components
6937    components = metric(reference, hypothesis, detailed=True)
6938    miss_rate = components['missed detection'] / components['total']
6939    false_alarm_rate = components['false alarm'] / components['total'] 
6940    speaker_error_rate = components['confusion'] / components['total']
6941
6942    return {
6943        "inverse_der": 1 - max(0, min(1, der)),
6944        "miss_rate": 1 - miss_rate,
6945        "false_alarm_rate": 1 - false_alarm_rate,
6946        "speaker_error_rate": 1 - speaker_error_rate
6947    }
6948
6949
6950diarization_model_id = "tezuesh/diarization"
6951overlap_detection_model_id = "tezuesh/overlapped-speech-detection" 
6952pipeline = CustomDiarizationPipeline(overlap_detection_model_id=overlap_detection_model_id,
6953                                    diarization_model_id=diarization_model_id)
6954
6955
6956
6957
6958---
6959File: /omega/diarization_pipeline.py
6960---
6961
6962import os
6963import torch
6964import torchaudio
6965import numpy as np
6966if hasattr(np, 'nan'):
6967    np.NaN = np.nan
6968    np.NAN = np.nan
6969from pyannote.audio import Pipeline
6970import pandas as pd
6971
6972
6973class CustomDiarizationPipeline:
6974    def __init__(self, overlap_detection_model_id, diarization_model_id, device="cuda"):
6975        self.device = torch.device(device)
6976        self.overlapped_speech_detection_pipeline = Pipeline.from_pretrained(overlap_detection_model_id).to(self.device)
6977        self.diarization_pipeline = Pipeline.from_pretrained(diarization_model_id).to(self.device)
6978
6979    
6980    def preprocess_audio(self, audio_arr, sr):
6981        waveform, sample_rate = torch.from_numpy(audio_arr), sr
6982        # Convert to mono if stereo
6983        if waveform.shape[0] > 1:
6984            waveform = torch.mean(waveform, dim=0, keepdim=True)
6985        
6986        # Apply high-pass filter to remove low frequency noise
6987        waveform = torchaudio.functional.highpass_biquad(waveform, sample_rate, cutoff_freq=100)
6988        
6989        # Apply noise reduction using spectral subtraction
6990        spec = torch.stft(waveform[0], 
6991                        n_fft=2048,
6992                        hop_length=512,
6993                        win_length=2048,
6994                        window=torch.hann_window(2048).to(waveform.device),
6995                        return_complex=True)
6996        
6997        # Estimate noise from first few frames
6998        noise_estimate = torch.mean(torch.abs(spec[:, :50]), dim=1, keepdim=True)
6999        
7000        # Subtract noise estimate and apply soft thresholding
7001        spec_mag = torch.abs(spec)
7002        spec_phase = torch.angle(spec)
7003        spec_mag = torch.maximum(spec_mag - 2 * noise_estimate, torch.zeros_like(spec_mag))
7004        
7005        # Reconstruct signal
7006        spec = spec_mag * torch.exp(1j * spec_phase)
7007        waveform = torch.istft(spec,
7008                            n_fft=2048, 
7009                            hop_length=512,
7010                            win_length=2048,
7011                            window=torch.hann_window(2048).to(waveform.device))
7012        waveform = waveform.unsqueeze(0)
7013        
7014        # Normalize audio
7015        waveform = waveform / torch.max(torch.abs(waveform))
7016
7017        return waveform, sample_rate
7018    
7019    def detect_overlapping_speech_and_run_diarization(self, audio_arr, sr):
7020        # waveform, sample_rate = self.preprocess_audio(audio_arr, sr)
7021        waveform, sample_rate = torch.from_numpy(audio_arr).unsqueeze(0).to(torch.float32), sr
7022        
7023        overlapping_segments = self.overlapped_speech_detection_pipeline({"waveform": waveform, "sample_rate": sample_rate})
7024        diarization = self.diarization_pipeline({"waveform": waveform, "sample_rate": sample_rate})
7025        diar_segments = []
7026        overlap_segments = []
7027
7028        for turn, _, speaker in diarization.itertracks(yield_label=True):
7029            diar_segments.append((turn.start, turn.end, speaker))
7030
7031        for speech in overlapping_segments.get_timeline().support():
7032            overlap_segments.append((speech.start, speech.end, None))
7033
7034        return overlap_segments, diar_segments
7035    
7036    def remove_overlapping_segments(self, overlap_segments, diar_segments):
7037        for overlap_segment in overlap_segments:
7038            overlap_start = overlap_segment[0]
7039            overlap_end = overlap_segment[1]
7040            temp_diar_segments = []
7041            for diar_segment in diar_segments:
7042                start, end, speaker = diar_segment
7043                if overlap_start <= start and overlap_end >= end:
7044                    continue  # fully overlapped: drop the segment entirely
7045                if overlap_start < end and overlap_end >= end:
7046                    temp_diar_segments.append((start, overlap_start, speaker))
7047                elif overlap_start <= start and overlap_end > start:
7048                    temp_diar_segments.append((overlap_end, end, speaker))
7049                elif overlap_start > start and overlap_end < end:
7050                    temp_diar_segments.append((start, overlap_start, speaker))
7051                    temp_diar_segments.append((overlap_end, end, speaker))
7052                else:
7053                    temp_diar_segments.append(diar_segment)
7054            diar_segments = temp_diar_segments
7055        # Remove any segments that were completely overlapped
7056        diar_segments = [seg for seg in diar_segments if seg is not None]
7057        return diar_segments
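The cutting rule used by `remove_overlapping_segments` can be demonstrated with a dependency-free toy (`subtract_interval` is a hypothetical helper written for illustration only): each overlap window is subtracted from every diarization segment, trimming heads and tails, splitting segments the window sits inside, and dropping segments it fully covers.

```python
def subtract_interval(segments, cut_start, cut_end):
    """Remove [cut_start, cut_end) from each (start, end, speaker) segment."""
    out = []
    for start, end, speaker in segments:
        if cut_end <= start or cut_start >= end:     # no overlap: keep as-is
            out.append((start, end, speaker))
        elif cut_start <= start and cut_end >= end:  # fully covered: drop
            continue
        else:
            if cut_start > start:                    # keep the head
                out.append((start, cut_start, speaker))
            if cut_end < end:                        # keep the tail
                out.append((cut_end, end, speaker))
    return out

segs = [(0.0, 5.0, "A"), (6.0, 8.0, "B")]
print(subtract_interval(segs, 2.0, 7.0))
# [(0.0, 2.0, 'A'), (7.0, 8.0, 'B')]
```

Speaker A loses its tail from 2.0 onward and speaker B loses its head up to 7.0; a segment lying entirely inside [2.0, 7.0) would be removed outright.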
7058
7059
7060    
7061    def write_segments_to_csv(self, segments, output_file, min_duration=0.5):
7062        """
7063        Write the start, end, and duration times of diarization segments to a CSV file using pandas.
7064
7065        Args:
7066            segments (list): List of tuples containing (start_time, end_time) for each segment.
7067            output_file (str): Path to the output CSV file.
7068        """
7069        data = []
7070        for segment in segments:
7071            start = segment[0]
7072            end = segment[1]
7073            if len(segment) > 2:
7074                speaker = segment[2]
7075            else:
7076                speaker = None
7077            duration = end - start
7078            if duration >= min_duration:
7079                data.append({'Start': start, 'End': end, 'Duration': duration, 'Speaker': speaker})
7080
7081        df = pd.DataFrame(data)
7082        df.to_csv(output_file, index=False)
7083
7084    def filter_segments_by_duration(self, segments, min_duration=0.7):
7085        return [segment for segment in segments if segment[1] - segment[0] >= min_duration]
7086    
7087    def generate_audio_patches(self, audio_arr, sr, segments, output_dir, min_duration=0.5):
7088        # Denoise the audio (note: the millisecond slicing and .export() below expect a pydub AudioSegment, not the tensor returned here)
7089        audio, sr = self.preprocess_audio(audio_arr, sr)
7090
7091        # Create output directory if it doesn't exist
7092        os.makedirs(output_dir, exist_ok=True)
7093
7094        # Generate audio patches for each speaker segment
7095        for idx, segment in enumerate(segments):
7096            start_time, end_time, speaker = segment
7097            duration = end_time - start_time
7098
7099            # Skip segments shorter than min_duration
7100            if duration < min_duration:
7101                continue
7102
7103            # Calculate start and end times in milliseconds
7104            start_ms = int(start_time * 1000)
7105            end_ms = int(end_time * 1000)
7106
7107            # Extract the audio segment
7108            audio_segment = audio[start_ms:end_ms]
7109
7110            # Generate output filename
7111            output_filename = f"{start_ms:07d}.wav"
7112            output_path = os.path.join(output_dir, output_filename)
7113            # print(f"Saving {output_path}")
7114
7115            # Export the audio segment
7116            audio_segment.export(output_path, format="wav")
7117
7118        print(f"Audio patches generated and saved in {output_dir}")
7119    
7120    def segments_to_dict(self, segments):
7121        start_timestamps = [segment[0] for segment in segments]
7122        end_timestamps = [segment[1] for segment in segments]
7123        speakers = [segment[2] for segment in segments]
7124        return {
7125            "start": start_timestamps,
7126            "end": end_timestamps,
7127            "speakers": speakers
7128        }
7129
7130
7131    def process(self, audio_arr, sr, output_path=None):
7132        overlapping_segments, diar_segments = self.detect_overlapping_speech_and_run_diarization(audio_arr, sr)
7133        
7134        filtered_overlapping_segments = self.filter_segments_by_duration(overlapping_segments)
7135        diar_segments = self.remove_overlapping_segments(filtered_overlapping_segments, diar_segments)
7136        segments_dict = self.segments_to_dict(diar_segments)
7137        return segments_dict
7138
7139
7140
7141
7142
7143
7144---
7145File: /omega/imagebind_wrapper.py
7146---
7147
7148import numpy as np
7149import os
7150import asyncio
7151import functools
7152from typing import List, BinaryIO, Optional
7153
7154from imagebind import data
7155from imagebind.models import imagebind_model
7156from imagebind.models.imagebind_model import ModalityType
7157from imagebind.models.multimodal_preprocessors import SimpleTokenizer, TextPreprocessor
7158from pydantic import BaseModel
7159import torch
7160
7161from omega import video_utils
7162
7163BPE_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "bpe", "bpe_simple_vocab_16e6.txt.gz")
7164V2_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), ".checkpoints", "videobind-v0.2.pth")
7165TOKENIZER = SimpleTokenizer(bpe_path=BPE_PATH)
7166LENGTH_TOKENIZER = SimpleTokenizer(bpe_path=BPE_PATH, context_length=1024)
7167TOKEN_CHUNK_SIZE = 74
7168
7169class Embeddings(BaseModel):
7170    class Config:
7171        arbitrary_types_allowed = True
7172
7173    video: Optional[torch.Tensor]
7174    audio: Optional[torch.Tensor]
7175    description: Optional[torch.Tensor]
7176
7177
7178def load_and_transform_text(text, device):
7179    if text is None:
7180        return None
7181    tokens = [TOKENIZER(t).unsqueeze(0).to(device) for t in text]
7182    tokens = torch.cat(tokens, dim=0)
7183    return tokens
7184
7185
7186def split_text_by_token_limit(text, tokenizer, max_tokens=TOKEN_CHUNK_SIZE):
7187    def fits_in_token_limit(text_segment):
7188        tokens = tokenizer(text_segment)
7189        tokens = tokens[tokens != 0][1:-1].tolist()
7190        return len(tokens) <= max_tokens
7191
7192    def recursive_split(text, delimiters):
7193        if fits_in_token_limit(text):
7194            return [text]
7195        if not delimiters:
7196            return split_by_tokens(text)
7197        delimiter = delimiters[0]
7198        parts = text.split(delimiter)
7199        result = []
7200        current_segment = ""
7201        for part in parts:
7202            candidate_segment = current_segment + (delimiter if current_segment else '') + part
7203            if fits_in_token_limit(candidate_segment):
7204                current_segment = candidate_segment
7205            else:
7206                if current_segment:
7207                    result.append(current_segment)
7208                current_segment = part
7209        if current_segment:
7210            result.append(current_segment)
7211        final_result = []
7212        for segment in result:
7213            if fits_in_token_limit(segment):
7214                final_result.append(segment)
7215            else:
7216                final_result.extend(recursive_split(segment, delimiters[1:]))
7217        return final_result
7218
7219    def split_by_tokens(text):
7220        tokens = tokenizer(text)
7221        tokens = tokens[tokens != 0][1:-1].tolist()
7222        chunks = np.array_split(tokens, -(len(tokens) // -max_tokens) or 1)  # ceiling division so no chunk exceeds max_tokens
7223        return [
7224            tokenizer.decode(segment_tokens)
7225            for segment_tokens in chunks
7226        ]
7227
7228    return recursive_split(text, ['\n', '.', '!', '?', ',', ' '])
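The fallback chunking arithmetic in `split_by_tokens` matters: with a ceiling-divided chunk count, `numpy.array_split` keeps every chunk within the token budget, whereas truncating division (`int(len / max)`) can leave an oversized chunk (e.g. 100 tokens / 74 truncates to 1 chunk of 100). A small illustration (`chunk_counts` is a hypothetical helper, not part of the codebase):

```python
import numpy as np

def chunk_counts(n_tokens, max_tokens):
    """Chunk sizes produced by splitting n_tokens with ceiling division."""
    n_chunks = -(n_tokens // -max_tokens) or 1   # ceil(n_tokens / max_tokens)
    chunks = np.array_split(np.arange(n_tokens), n_chunks)
    return [len(c) for c in chunks]

print(chunk_counts(100, 74))  # [50, 50] -- both within the 74-token limit
```

`np.array_split` distributes the remainder, so the resulting chunks differ in length by at most one token.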
7229
7230def load_and_transform_text_chunks(text, device):
7231    if not text:
7232        return []
7233    all_tokens = LENGTH_TOKENIZER(text)
7234    all_tokens = all_tokens[all_tokens != 0][1:-1].tolist()
7235
7236    return [
7237        load_and_transform_text([segment], device)
7238        for segment in split_text_by_token_limit(text, LENGTH_TOKENIZER)
7239    ]
7240
7241def run_async(func, *args, **kwargs):
7242    loop = asyncio.get_running_loop()  # requires an active event loop (run_async is awaited from coroutines)
7243    return loop.run_in_executor(None, functools.partial(func, *args, **kwargs))
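`run_async` pushes a blocking call onto the event loop's default thread-pool executor so the loop stays responsive while the call runs. A self-contained sketch of the same pattern (`blocking_square` is a made-up stand-in for model inference):

```python
import asyncio
import functools
import time

def blocking_square(x):
    time.sleep(0.01)  # stand-in for a blocking model call
    return x * x

async def main():
    loop = asyncio.get_running_loop()
    # Same pattern as run_async: offload the call to the default executor
    return await loop.run_in_executor(None, functools.partial(blocking_square, 7))

print(asyncio.run(main()))  # 49
```

`functools.partial` is needed because `run_in_executor` accepts only positional arguments after the callable; it is what lets `run_async` forward `**kwargs`.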
7244
7245
7246class ImageBind:
7247    def __init__(self, device="cuda:0", v2=False):
7248        self.device = device
7249        self.v2 = v2
7250        if v2:
7251            if not os.path.exists(V2_PATH):
7252                os.makedirs(os.path.dirname(V2_PATH), exist_ok=True)
7253                torch.hub.download_url_to_file(
7254                    "https://huggingface.co/jondurbin/videobind-v0.2/resolve/main/videobind.pth",
7255                    V2_PATH,
7256                    progress=True,
7257                )
7258            self.imagebind = torch.load(V2_PATH, map_location=self.device)
7259        else:
7260            self.imagebind = imagebind_model.imagebind_huge(pretrained=True)
7261        self.imagebind.eval()
7262        self.imagebind.to(self.device)
7263
7264    def generate_text_embeddings(self, text: str):
7265        if not self.v2:
7266            return self.imagebind({
7267                ModalityType.TEXT: load_and_transform_text([text], self.device)
7268            })[ModalityType.TEXT]
7269        chunks = load_and_transform_text_chunks(text, self.device)
7270        embeddings = [
7271            self.imagebind({ModalityType.TEXT: chunk})[ModalityType.TEXT]
7272            for chunk in chunks
7273        ]
7274        return torch.mean(torch.stack(embeddings), dim=0)
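For v2, `generate_text_embeddings` embeds each chunk separately and returns the element-wise mean of the per-chunk embeddings. A numpy sketch of that pooling step, using made-up 1x2 embeddings in place of the real `ModalityType.TEXT` outputs:

```python
import numpy as np

# Two hypothetical per-chunk embeddings (each 1 x 2)
chunk_embs = [np.array([[1.0, 2.0]]), np.array([[3.0, 6.0]])]

# Stack to shape (n_chunks, 1, 2), then average over the chunk axis,
# mirroring torch.mean(torch.stack(embeddings), dim=0)
text_emb = np.mean(np.stack(chunk_embs), axis=0)
print(text_emb)  # [[2. 4.]]
```

Mean pooling keeps the output shape identical to a single-chunk embedding, so callers downstream need not care how many chunks the text was split into.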
7275
7276    def get_inputs(self, video_file: BinaryIO) -> dict:
7277        audio_file = video_utils.copy_audio(video_file.name)
7278        try:
7279            duration = video_utils.get_video_duration(video_file.name)
7280            video_data = data.load_and_transform_video_data(
7281                [video_file.name],
7282                self.device,
7283            )
7284            audio_data = data.load_and_transform_audio_data(
7285                [audio_file.name],
7286                self.device,
7287            )
7288            inputs = {
7289                ModalityType.VISION: video_data,
7290                ModalityType.AUDIO: audio_data,
7291            }
7292            return inputs
7293        finally:
7294            audio_file.close()
7295
7296    @torch.no_grad()
7297    def embed(self, descriptions: List[str], video_files: List[BinaryIO]) -> Embeddings:
7298        return_value = None
7299        for idx in range(len(descriptions)):
7300            inputs = self.get_inputs(video_files[idx])
7301            embeddings = self.imagebind(inputs)
7302            text_embeddings = self.generate_text_embeddings(descriptions[idx])
7303            if return_value is None:
7304                return_value = Embeddings(
7305                    video=embeddings[ModalityType.VISION],
7306                    audio=embeddings[ModalityType.AUDIO],
7307                    description=text_embeddings,
7308                )
7309            else:
7310                return_value.video = torch.cat((return_value.video, embeddings[ModalityType.VISION]))
7311                return_value.audio = torch.cat((return_value.audio, embeddings[ModalityType.AUDIO]))
7312                return_value.description = torch.cat((return_value.description, text_embeddings))
7313        return return_value
7314
7315    @torch.no_grad()
7316    def embed_only_video(self, video_files: List[BinaryIO]) -> Embeddings:
7317        video_filepaths = [video_file.name for video_file in video_files]
7318        durations = [video_utils.get_video_duration(f.name) for f in video_files]
7319        embeddings = self.imagebind({
7320            ModalityType.VISION: [
7321                data.load_and_transform_video_data(
7322                    [video_filepaths[idx]],
7323                    self.device,
7324                )[0]
7325                for idx in range(len(video_filepaths))
7326            ]
7327        })
7328        return Embeddings(
7329            video=embeddings[ModalityType.VISION],
7330        )
7331
7332    @torch.no_grad()
7333    def embed_video_and_text(self, video_files: List[BinaryIO], descriptions: List[str]) -> Embeddings:
7334        video_filepaths = [video_file.name for video_file in video_files]
7335        durations = [video_utils.get_video_duration(f.name) for f in video_files]
7336        embeddings = self.imagebind({
7337            ModalityType.VISION: [
7338                data.load_and_transform_video_data(
7339                    [video_filepaths[idx]],
7340                    self.device,
7341                )[0]
7342                for idx in range(len(video_filepaths))
7343            ],
7344        })
7345        description_embeddings = torch.stack([
7346            self.generate_text_embeddings(description)
7347            for description in descriptions
7348        ])
7349        return Embeddings(
7350            video=embeddings[ModalityType.VISION],
7351            description=description_embeddings,
7352        )
7353
7354    @torch.no_grad()
7355    def embed_text(self, texts: List[str]) -> torch.Tensor:
7356        return_value = None
7357        for text in texts:
7358            emb = self.generate_text_embeddings(text)
7359            if return_value is None:  # explicit None check; `not tensor` raises for multi-element tensors
7360                return_value = emb
7361            else:
7362                return_value = torch.cat((return_value, emb))
7363        return return_value
7364
7365    @torch.no_grad()
7366    async def embed_async(self, descriptions: List[str], video_files: List[BinaryIO]) -> Embeddings:
7367        return_value = None
7368        for idx in range(len(descriptions)):
7369            inputs = self.get_inputs(video_files[idx])  # cannot be async
7370            embeddings = await run_async(self.imagebind, inputs)
7371            text_embeddings = await run_async(self.generate_text_embeddings, descriptions[idx])
7372            if return_value is None:
7373                return_value = Embeddings(
7374                    video=embeddings[ModalityType.VISION],
7375                    audio=embeddings[ModalityType.AUDIO],
7376                    description=text_embeddings,
7377                )
7378            else:
7379                return_value.video = torch.cat((return_value.video, embeddings[ModalityType.VISION]))
7380                return_value.audio = torch.cat((return_value.audio, embeddings[ModalityType.AUDIO]))
7381                return_value.description = torch.cat((return_value.description, text_embeddings))
7382        return return_value
7383
7384    async def embed_text_async(self, texts: List[str]) -> torch.Tensor:
7385        return await run_async(self.embed_text, texts)
7386
7387
7388
7389---
7390File: /omega/miner_utils.py
7391---
7392
7393from io import BytesIO
7394import os
7395import time
7396from typing import List, Tuple
7397
7398import soundfile as sf
7399import bittensor as bt
7400
7401from omega.protocol import VideoMetadata, AudioMetadata
7402from omega.imagebind_wrapper import ImageBind
7403from omega.constants import MAX_VIDEO_LENGTH, FIVE_MINUTES, MAX_AUDIO_LENGTH_SECONDS, MIN_AUDIO_LENGTH_SECONDS
7404from omega import video_utils
7405from omega.diarization_pipeline import CustomDiarizationPipeline
7406
7407if os.getenv("OPENAI_API_KEY"):
7408    from openai import OpenAI
7409    OPENAI_CLIENT = OpenAI()
7410else:
7411    OPENAI_CLIENT = None
7412
7413
7414def get_description(yt: video_utils.YoutubeDL, video_path: str) -> str:
7415    """
7416    Get / generate the description of a video from the YouTube API.
7417    
7418    Miner TODO: Implement logic to get / generate the most relevant and information-rich
7419    description of a video from the YouTube API.
7420    """
7421    description = yt.title
7422    if yt.description:
7423        description += f"\n\n{yt.description}"
7424    return description
7425
7426
7427def get_relevant_timestamps(query: str, yt: video_utils.YoutubeDL, video_path: str, max_length: int) -> Tuple[int, int]:
7428    """
7429    Get the optimal start and end timestamps (in seconds) of a video for ensuring relevance
7430    to the query.
7431
7432    Miner TODO: Implement logic to get the optimal start and end timestamps of a video for
7433    ensuring relevance to the query.
7434    """
7435    start_time = 0
7436    end_time = min(yt.length, max_length)
7437    return start_time, end_time
7438
7439
7440def search_and_embed_youtube_videos(query: str, num_videos: int, imagebind: ImageBind) -> List[VideoMetadata]:
7441    """
7442    Search YouTube for videos matching the given query and return a list of VideoMetadata objects.
7443
7444    Args:
7445        query (str): The query to search for.
7446        num_videos (int): The number of videos to return.
7447
7448    Returns:
7449        List[VideoMetadata]: A list of VideoMetadata objects representing the search results.
7450    """
7451    # fetch more videos than we need
7452    results = video_utils.search_videos(query, max_results=int(num_videos * 1.5))
7453    video_metas = []
7454    try:
7455        # take the first N that we need
7456        for result in results:
7457            start = time.time()
7458            download_path = video_utils.download_youtube_video(
7459                result.video_id,
7460                start=0,
7461                end=min(result.length, FIVE_MINUTES)  # download the first 5 minutes at most
7462            )
7463            if download_path:
7464                clip_path = None
7465                try:
7466                    result.length = video_utils.get_video_duration(download_path.name)  # correct the length
7467                    bt.logging.info(f"Downloaded video {result.video_id} ({min(result.length, FIVE_MINUTES)}) in {time.time() - start} seconds")
7468                    start, end = get_relevant_timestamps(query, result, download_path, max_length=MAX_VIDEO_LENGTH)
7469                    description = get_description(result, download_path)
7470                    clip_path = video_utils.clip_video(download_path.name, start, end)
7471                    bt.logging.info(f"Clip video path: {clip_path}")
7472                    embeddings = imagebind.embed([description], [clip_path])
7473                    video_metas.append(VideoMetadata(
7474                        video_id=result.video_id,
7475                        description=description,
7476                        views=result.views,
7477                        start_time=start,
7478                        end_time=end,
7479                        video_emb=embeddings.video[0].tolist(),
7480                        audio_emb=embeddings.audio[0].tolist(),
7481                        description_emb=embeddings.description[0].tolist(),
7482                    ))
7483                finally:
7484                    download_path.close()
7485                    if clip_path:
7486                        clip_path.close()
7487            if len(video_metas) == num_videos:
7488                break
7489
7490    except Exception as e:
7491        bt.logging.error(f"Error searching for videos: {e}")
7492
7493    return video_metas
7494
7495
7496
7497
7498def search_and_diarize_youtube_videos(query: str, num_videos: int, diarization_pipeline: CustomDiarizationPipeline, imagebind: ImageBind) -> List[AudioMetadata]:
7499    """
7500    Search YouTube for videos matching the given query and return a list of AudioMetadata objects.
7501
7502    Args:
7503        query (str): The query to search for.
7504        num_videos (int): The number of videos to return.
7505
7506    Returns:
7507        List[AudioMetadata]: A list of AudioMetadata objects representing the search results.
7508    """
7509    results = video_utils.search_videos(query, max_results=int(num_videos * 1.5))
7510    bt.logging.info(f"Audio Results: {results}")
7511    audio_metas = []
7512    try:
7513        # take the first N that we need
7514        for result in results:
7515            start_time_loop = time.time()
7516            download_path = video_utils.download_youtube_video(
7517                result.video_id,
7518                start=0,
7519                end=min(result.length, MAX_AUDIO_LENGTH_SECONDS)  # download at most MAX_AUDIO_LENGTH_SECONDS of audio
7520            )
7521            if download_path:
7522                clip_path = None
7523                try:
7524                    result.length = video_utils.get_video_duration(download_path.name)  # correct the length
7525                    bt.logging.info(f"Downloaded audio {result.video_id} ({min(result.length, MAX_AUDIO_LENGTH_SECONDS)}) in {time.time() - start_time_loop} seconds")
7526                    start, end = get_relevant_timestamps(query, result, download_path, max_length=MAX_AUDIO_LENGTH_SECONDS)
7527                    # bt.logging.info(f"Audio Start: {start}, End: {end}")
7528                    description = get_description(result, download_path)
7529                    audio_bytes = video_utils.get_audio_bytes(download_path.name)
7530                    audio_array, sr = sf.read(BytesIO(audio_bytes))
7531                    dataframe = diarization_pipeline.process(audio_array, sr)
7532                    diar_timestamps_start = dataframe["start"]
7533                    diar_timestamps_end = dataframe["end"]
7534                    diar_speakers = dataframe["speakers"]
7535                    clip_path = video_utils.clip_video(download_path.name, start, end)
7536                    bt.logging.info(f"Clip video path: {clip_path}")
7537                    embeddings = imagebind.embed([description], [clip_path])
7538                    bt.logging.info(f"Embeddings: {type(embeddings)}, audio_emb: {type(embeddings.audio[0])}, audio_array: {type(audio_array)} {audio_array.shape}, audio_bytes: {type(audio_bytes)}, sr: {sr}, diar_timestamps_start: {type(diar_timestamps_start)}, diar_timestamps_end: {type(diar_timestamps_end)}, diar_speakers: {type(diar_speakers)}")
7539                    bt.logging.info(f"Audio duration: {end - start}, actual length: {result.length}")
7540                    bt.logging.info(f"Diarization Dataframe: {dataframe}")
7541                    # Convert audio_bytes to base64 string for serialization
7542                    import base64
7543                    audio_bytes_b64 = base64.b64encode(audio_bytes).decode('utf-8')
7544                    
7545                    audio_metas.append(AudioMetadata(
7546                        video_id=result.video_id,
7547                        views=result.views,
7548                        start_time=start,
7549                        end_time=end,
7550                        audio_emb=embeddings.audio[0].tolist(),
7551                        audio_bytes=audio_bytes_b64,  # Store base64 encoded string instead of raw bytes
7552                        diar_timestamps_start=diar_timestamps_start,
7553                        diar_timestamps_end=diar_timestamps_end,
7554                        diar_speakers=diar_speakers,
7555                    ))
7556                finally:
7557                    download_path.close()
7558                    if clip_path:
7559                        clip_path.close()
7560            if len(audio_metas) == num_videos:
7561                break
7562            end_time_loop = time.time()
7563            bt.logging.info(f"Audio Time taken for loop: {end_time_loop - start_time_loop}")
7564
7565    except Exception as e:
7566        bt.logging.error(f"Error searching for videos: {e}")
7567
7568    return audio_metas
7569
7570
7571
7572
7573---
7574File: /omega/mock.py
7575---
7576
7577import time
7578
7579import asyncio
7580import random
7581import bittensor as bt
7582
7583from typing import List
7584
7585
7586class MockSubtensor(bt.MockSubtensor):
7587    def __init__(self, netuid, n=16, wallet=None, network="mock"):
7588        super().__init__(network=network)
7589
7590        if not self.subnet_exists(netuid):
7591            self.create_subnet(netuid)
7592
7593        # Register ourself (the validator) as a neuron at uid=0
7594        if wallet is not None:
7595            self.force_register_neuron(
7596                netuid=netuid,
7597                hotkey=wallet.hotkey.ss58_address,
7598                coldkey=wallet.coldkey.ss58_address,
7599                balance=100000,
7600                stake=100000,
7601            )
7602
7603        # Register n mock neurons who will be miners
7604        for i in range(1, n + 1):
7605            self.force_register_neuron(
7606                netuid=netuid,
7607                hotkey=f"miner-hotkey-{i}",
7608                coldkey="mock-coldkey",
7609                balance=100000,
7610                stake=100000,
7611            )
7612
7613
7614class MockMetagraph(bt.metagraph):
7615    def __init__(self, netuid=1, network="mock", subtensor=None):
7616        super().__init__(
7617            netuid=netuid, network=network, sync=False
7618        )
7619
7620        if subtensor is not None:
7621            self.subtensor = subtensor
7622        self.sync(subtensor=subtensor)
7623
7624        for axon in self.axons:
7625            axon.ip = "127.0.0.0"
7626            axon.port = 8091
7627
7628        bt.logging.info(f"Metagraph: {self}")
7629        bt.logging.info(f"Axons: {self.axons}")
7630
7631
7632class MockDendrite(bt.dendrite):
7633    """
7634    Replaces a real bittensor network request with a mock request that just returns some static response for all axons that are passed and adds some random delay.
7635    """
7636    def __init__(self, wallet):
7637        super().__init__(wallet)
7638
7639    async def forward(
7640        self,
7641        axons: List[bt.axon],
7642        synapse: bt.Synapse = bt.Synapse(),
7643        timeout: float = 12,
7644        deserialize: bool = True,
7645        run_async: bool = True,
7646        streaming: bool = False,
7647    ):
7648
7649        if streaming:
7650            raise NotImplementedError("Streaming not implemented yet.")
7651
7652        async def query_all_axons(streaming: bool):
7653            """Queries all axons for responses."""
7654
7655            async def single_axon_response(i, axon):
7656                """Queries a single axon for a response."""
7657
7658                start_time = time.time()
7659                s = synapse.copy()
7660                # Attach some more required data so it looks real
7661                s = self.preprocess_synapse_for_request(axon, s, timeout)
7662                # We just want to mock the response, so we'll just fill in some data
7663                process_time = random.random()
7664                if process_time < timeout:
7665                    s.dendrite.process_time = str(time.time() - start_time)
7666                    # Update the status code and status message of the dendrite to match the axon
7667                    # TODO (developer): replace with your own expected synapse data
7668                    s.dummy_output = s.dummy_input * 2
7669                    s.dendrite.status_code = 200
7670                    s.dendrite.status_message = "OK"
7671                    synapse.dendrite.process_time = str(process_time)
7672                else:
7673                    s.dummy_output = 0
7674                    s.dendrite.status_code = 408
7675                    s.dendrite.status_message = "Timeout"
7676                    synapse.dendrite.process_time = str(timeout)
7677
7678                # Return the updated synapse object after deserializing if requested
7679                if deserialize:
7680                    return s.deserialize()
7681                else:
7682                    return s
7683
7684            return await asyncio.gather(
7685                *(single_axon_response(i, target_axon) for i, target_axon in enumerate(axons))
7686            )
7687
7688        return await query_all_axons(streaming)
7689
7690    def __str__(self) -> str:
7691        """
7692        Returns a string representation of the Dendrite object.
7693
7694        Returns:
7695            str: The string representation of the Dendrite object in the format "dendrite(<user_wallet_address>)".
7696        """
7697        return "MockDendrite({})".format(self.keypair.ss58_address)
7698
7699
7700---
7701File: /omega/protocol.py
7702---
7703
7704# The MIT License (MIT)
7705# Copyright © 2023 Yuma Rao
7706# Copyright © 2023 Omega Labs, Inc.
7707
7708# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
7709# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
7710# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
7711# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
7712
7713# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
7714# the Software.
7715
7716# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
7717# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
7718# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
7719# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
7720# DEALINGS IN THE SOFTWARE.
7721
7722import typing
7723import json
7724
7725import bittensor as bt
7726from pydantic import BaseModel
7727
7728
7729class VideoMetadata(BaseModel):
7730    """
7731    A model class representing YouTube video metadata.
7732    """
7733    video_id: str
7734    description: str
7735    views: int
7736    start_time: int
7737    end_time: int
7738    video_emb: typing.List[float]
7739    audio_emb: typing.List[float]
7740    description_emb: typing.List[float]
7741
7742    def __repr_args__(self):
7743        parent_args = super().__repr_args__()
7744        exclude_args = ['video_emb', 'audio_emb', 'description_emb']
7745        return (
7746            [(a, v) for a, v in parent_args if a not in exclude_args] +
7747            [(a, ["..."]) for a in exclude_args]
7748        )
7749
7750
7751class Videos(bt.Synapse):
7752    """
7753    A synapse class representing a video scraping request and response.
7754
7755    Attributes:
7756    - query: the input query for which to find relevant videos
7757    - num_videos: the number of videos to return
7758    - video_metadata: a list of video metadata objects
7759    """
7760
7761    query: str
7762    num_videos: int
7763    video_metadata: typing.Optional[typing.List[VideoMetadata]] = None
7764
7765    def deserialize(self) -> typing.List[VideoMetadata]:
7766        assert self.video_metadata is not None
7767        return self.video_metadata
7768
7769    def to_serializable_dict(self, input_synapse: "Videos") -> dict:
7770        """
7771        Dumps the Videos object to a serializable dict, but makes sure to use input properties from
7772        the input_synapse, while taking the non-null output property video_metadata from the
7773        response (self).
7774        """
7775        json_str = self.replace_with_input(input_synapse).json(
7776            include={"query", "num_videos", "video_metadata"})
7777        return json.loads(json_str)
7778
7779    def replace_with_input(self, input_synapse: "Videos") -> "Videos":
7780        """
7781        Replaces the query and num_videos of current synapse with the given input synapse.
7782        """
7783        return Videos(
7784            query=input_synapse.query,
7785            num_videos=input_synapse.num_videos,
7786            video_metadata=self.video_metadata[:input_synapse.num_videos],
7787            axon=self.axon
7788        )
7789
7790
7791
7792
7793class AudioMetadata(BaseModel):
7794    video_id: str
7795    views: int
7796    start_time: int
7797    end_time: int
7798    audio_emb: typing.List[float]
7799    audio_bytes: typing.Optional[str] = None
7800    diar_timestamps_start: typing.List[float]
7801    diar_timestamps_end: typing.List[float]
7802    diar_speakers: typing.List[str]
7803
7804    def __repr_args__(self):
7805        parent_args = super().__repr_args__()
7806        exclude_args = ['audio_emb', 'audio_bytes', 'diar_timestamps_start', 'diar_timestamps_end', 'diar_speakers']
7807        return (
7808            [(a, v) for a, v in parent_args if a not in exclude_args] +
7809            [(a, ["..."]) for a in exclude_args]
7810        )
7811    
7812
7813class Audios(bt.Synapse):
7814    """
7815    A synapse class representing an audio request and response.
7816
7817    Attributes:
7818    - query: the input query for which to find relevant videos
7819    - num_audios: the number of audios to return
7820    - audio_metadata: an audio metadata object
7821    """
7822
7823    query: str
7824    num_audios: int
7825    audio_metadata: typing.Optional[typing.List[AudioMetadata]] = None
7826
7827    def deserialize(self) -> typing.List[AudioMetadata]:
7828        assert self.audio_metadata is not None
7829        return self.audio_metadata
7830
7831    def to_serializable_dict(self, input_synapse: "Audios") -> dict:
7832        """
7833        Dumps the Audio object to a serializable dict, but makes sure to use input properties from
7834        the input_synapse, while taking the non-null output property audio_metadata from the
7835        response (self).
7836        """
7837        json_str = self.replace_with_input(input_synapse).json(
7838            include={"query", "num_audios", "audio_metadata"})
7839        return json.loads(json_str)
7840
7841    def replace_with_input(self, input_synapse: "Audios") -> "Audios":
7842        """
7843        Replaces the query and num_audios of current synapse with the given input synapse.
7844        """
7845        return Audios(
7846            query=input_synapse.query,
7847            num_audios=input_synapse.num_audios,
7848            audio_metadata=self.audio_metadata,
7849            axon=self.axon
7850        )
7851
7852
7853
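The trimming performed by `Videos.replace_with_input` can be illustrated with a plain-Python stand-in (hypothetical helper; the real method lives on the `bt.Synapse` subclass above and also carries the `axon` field):

```python
# Plain-Python sketch of Videos.replace_with_input: the validator's input
# properties are kept, and the miner's response metadata is truncated to
# the requested count, discarding any extra entries the miner returned.
def replace_with_input(response_metadata, input_query, input_num_videos):
    return {
        "query": input_query,
        "num_videos": input_num_videos,
        "video_metadata": response_metadata[:input_num_videos],
    }

trimmed = replace_with_input(["m1", "m2", "m3"], "cats", 2)
print(trimmed["video_metadata"])  # ['m1', 'm2']
```

Note for comparison: `Audios.replace_with_input` above passes `audio_metadata` through without a corresponding `[:num_audios]` slice.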
7854---
7855File: /omega/subnet_links.py
7856---
7857
7858SUBNET_LINKS = [
7859    {"name": "sn0", "url": ""},
7860    {"name": "sn1", "url": "https://github.com/opentensor/text-prompting/"},
7861    {"name": "sn2", "url": "https://github.com/bittranslateio/bittranslate/"},
7862    {
7863        "name": "sn3",
7864        "url": "https://github.com/gitphantomman/scraping_subnet/",
7865    },
7866    {"name": "sn4", "url": "https://github.com/manifold-inc/targon/"},
7867    {"name": "sn5", "url": "https://github.com/unconst/ImageSubnet/"},
7868    {"name": "sn6", "url": ""},
7869    {"name": "sn7", "url": "https://github.com/tensorage/tensorage/"},
7870    {
7871        "name": "sn8",
7872        "url": "https://github.com/taoshidev/time-series-prediction-subnet/",
7873    },
7874    {"name": "sn9", "url": "https://github.com/unconst/pretrain-subnet/"},
7875    {
7876        "name": "sn10",
7877        "url": "https://github.com/dream-well/map-reduce-subnet/",
7878    },
7879    {"name": "sn11", "url": "https://github.com/opentensor/text-prompting/"},
7880    {"name": "sn12", "url": ""},
7881    {"name": "sn13", "url": "https://github.com/RusticLuftig/data-universe/"},
7882    {
7883        "name": "sn14",
7884        "url": "https://github.com/ceterum1/llm-defender-subnet/",
7885    },
7886    {
7887        "name": "sn15",
7888        "url": "https://github.com/blockchain-insights/blockchain-data-subnet/",
7889    },
7890    {"name": "sn16", "url": "https://github.com/UncleTensor/AudioSubnet/"},
7891    {"name": "sn17", "url": "https://github.com/CortexLM/flavia/"},
7892    {"name": "sn18", "url": "https://github.com/corcel-api/cortex.t/"},
7893    {"name": "sn19", "url": "https://github.com/namoray/vision/"},
7894    {"name": "sn20", "url": "https://github.com/oracle-subnet/oracle-subnet/"},
7895    {"name": "sn21", "url": "https://github.com/ifrit98/storage-subnet/"},
7896    {"name": "sn22", "url": "https://github.com/surcyf123/smart-scrape/"},
7897    {"name": "sn23", "url": "https://github.com/NicheTensor/NicheImage/"},
7898    {"name": "sn24", "url": "https://github.com/eseckft/BitAds.ai/tree/main"},
7899    {"name": "sn25", "url": "https://github.com/KMFODA/DistributedTraining/"},
7900    {
7901        "name": "sn26",
7902        "url": "https://github.com/Supreme-Emperor-Wang/ImageAlchemy/",
7903    },
7904    {
7905        "name": "sn27",
7906        "url": "https://github.com/neuralinternet/compute-subnet/",
7907    },
7908    {"name": "sn28", "url": "https://github.com/zktensor/zktensor_subnet/"},
7909    {"name": "sn29", "url": "https://github.com/404-Repo/Subnet-29/"},
7910    {"name": "sn30", "url": ""},
7911    {
7912        "name": "sn31",
7913        "url": "https://github.com/bthealthcare/healthcare-subnet",
7914    },
7915    {"name": "sn32", "url": "https://github.com/RoyalTensor/roleplay/"},
7916]
7917
7918
7919
7920---
7921File: /omega/test_audio.py
7922---
7923
7924from omega.video_utils import get_audio_bytes
7925import base64
7926audio_bytes = get_audio_bytes("test_video.mp4")
7927print(audio_bytes)
7928
7929# Save audio bytes to a WAV file
7930with open('output_audio.wav', 'wb') as f:
7931    f.write(audio_bytes)
7932
7933audio_bytes_b64 = base64.b64encode(audio_bytes).decode('utf-8')
7934print(audio_bytes_b64)
7935# Save base64 encoded audio to file
7936with open('output_audio_b64.txt', 'w') as f:
7937    f.write(audio_bytes_b64)
7938
7939
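The base64 step in the test script above is a lossless transport encoding; decoding restores the original WAV bytes exactly. A self-contained sketch with placeholder bytes (since `test_video.mp4` is not available here):

```python
import base64

# Placeholder standing in for real WAV data from get_audio_bytes().
audio_bytes = b"RIFF....WAVEfmt "

# Encode for transport (e.g., inside AudioMetadata.audio_bytes), then
# decode and confirm the round trip is exact.
encoded = base64.b64encode(audio_bytes).decode("utf-8")
decoded = base64.b64decode(encoded)
assert decoded == audio_bytes
```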
7940
7941---
7942File: /omega/text_similarity.py
7943---
7944
7945import torch
7946import torch.nn.functional as F
7947from transformers import AutoModel, AutoTokenizer
7948
7949model_path = "Alibaba-NLP/gte-large-en-v1.5"
7950revision = "104333d6af6f97649377c2afbde10a7704870c7b"
7951TOKENIZER = AutoTokenizer.from_pretrained(model_path, revision=revision)
7952DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
7953MODEL = AutoModel.from_pretrained(model_path, trust_remote_code=True, revision=revision).to(DEVICE)
7954MODEL.eval()
7955
7956def get_text_similarity_score(text_0, text_1):
7957    tokens = TOKENIZER([text_0, text_1], max_length=1024, padding=True, truncation=True, return_tensors='pt').to(DEVICE)
7958    outputs = MODEL(**tokens)
7959    embeddings = outputs.last_hidden_state[:, 0]
7960    embeddings = F.normalize(embeddings, p=2, dim=1)
7961    scores = (embeddings[:1] @ embeddings[1:].T) * 100
7962    return min(0.5, (scores.tolist()[0][0] / 100) ** 2)
7963
7964
7965
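Since the model scales cosine similarity by 100 and the score is then divided by 100 before squaring, `get_text_similarity_score` effectively returns `min(0.5, cos²)`. A sketch of that arithmetic over plain float vectors, without the model:

```python
import math

def normalize(v):
    # L2-normalize, as F.normalize(embeddings, p=2, dim=1) does above.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def similarity_score(emb_0, emb_1):
    # Dot product of unit vectors is cosine similarity; square it and
    # cap at 0.5, matching the return expression above.
    a, b = normalize(emb_0), normalize(emb_1)
    cosine = sum(x * y for x, y in zip(a, b))
    return min(0.5, cosine ** 2)

print(similarity_score([1.0, 0.0], [1.0, 0.0]))  # identical vectors cap at 0.5
```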
7966---
7967File: /omega/unstuff.py
7968---
7969
7970import torch
7971from transformers import pipeline
7972from typing import Tuple
7973import bittensor as bt
7974import random
7975import torch.nn.functional as F
7976from omega.imagebind_wrapper import (
7977    split_text_by_token_limit,
7978    SimpleTokenizer,
7979    BPE_PATH,
7981)
7982CHUNK_SIZE = 60
7983TOKENIZER = SimpleTokenizer(bpe_path=BPE_PATH, context_length=10000)
7984
7985UNSTUFF = pipeline("text-classification", "jondurbin/unstuffer-v0.2", device="cuda" if torch.cuda.is_available() else "cpu")
7986
7987def is_stuffed(description: str) -> Tuple[bool, float]:
7988    result = UNSTUFF(description, truncation=True, max_length=512)
7989    stuffed = int(result[0]["label"]) != 1
7990    confidence = result[0]["score"]
7991    if stuffed and confidence > 0.75:
7992        print(f"Detected stuffed description [{confidence=}]: {description}")
7993    elif not stuffed and random.random() <= 0.01:
7994        print(f"Description does not appear to be stuffed [{confidence=}]: {description}")
7995    return stuffed, confidence
7996
7997def check_extraneous_chunks(description, video_emb, audio_emb, imagebind):
7998    bt.logging.info(f"Length of description: {len(description)}")
7999    bt.logging.info(f"Length of video_emb: {len(video_emb)}")
8000    bt.logging.info(f"Length of audio_emb: {len(audio_emb)}")
8001    text_chunks = [
8002        chunk
8003        for chunk in split_text_by_token_limit(description, TOKENIZER, CHUNK_SIZE)
8004        if len(TOKENIZER(chunk)) >= 5
8005    ]
8006    if len(text_chunks) <= 1:
8007        return 0.0, 0.0, 0.0
8008    similarities = []
8009    for text in text_chunks:
8010        text_emb = imagebind.embed_text([text]).to("cpu")
8011        v_cosim = F.cosine_similarity(
8012            torch.tensor(video_emb).unsqueeze(0), text_emb
8013        ).tolist()[0]
8014        a_cosim = F.cosine_similarity(
8015            torch.tensor(audio_emb).unsqueeze(0), text_emb
8016        ).tolist()[0]
8017        similarities.append((v_cosim + a_cosim) / 2)
8018    best = max(similarities)
8019    low_quality = 0
8020    really_bad = 0
8021    for idx in range(len(similarities)):
8022        similarity = similarities[idx]
8023        text = text_chunks[idx]
8024        if similarity < best * 0.6:
8025            low_quality += 1
8026        if similarity < 0.12:
8027            really_bad += 1
8028    return really_bad, low_quality, len(similarities)
8029
8030
8031
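The tail of `check_extraneous_chunks` reduces the per-chunk similarities to two counts: chunks below 60% of the best chunk's score, and chunks below an absolute 0.12 floor. A stand-in over plain floats (hypothetical helper name; thresholds copied from the loop above):

```python
def count_weak_chunks(similarities):
    # Chunks well below the best chunk are "low quality"; chunks below
    # an absolute floor are "really bad" (likely extraneous stuffing).
    best = max(similarities)
    low_quality = sum(1 for s in similarities if s < best * 0.6)
    really_bad = sum(1 for s in similarities if s < 0.12)
    return really_bad, low_quality, len(similarities)

print(count_weak_chunks([0.30, 0.25, 0.10, 0.05]))  # (2, 2, 4)
```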
8032---
8033File: /omega/video_utils.py
8034---
8035
8036import re
8037import json
8038import os
8039import tempfile
8040from typing import Optional, BinaryIO
8041import requests
8042import bittensor as bt
8043import ffmpeg
8044from pydantic import BaseModel
8045from yt_dlp import YoutubeDL
8046import librosa
8047import numpy as np
8048
8049from omega.constants import FIVE_MINUTES
8050
8051
8052def seconds_to_str(seconds):
8053    hours = seconds // 3600
8054    minutes = (seconds % 3600) // 60
8055    seconds = seconds % 60
8056    return f"{hours:02}:{minutes:02}:{seconds:02}"
8057
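For reference, the formatter above maps raw second offsets to the `HH:MM:SS` strings passed to ffmpeg's `ss`/`to` options:

```python
# Same logic as seconds_to_str above: split a second count into
# hours, minutes, seconds and zero-pad each component.
def seconds_to_str(seconds):
    hours = seconds // 3600
    minutes = (seconds % 3600) // 60
    seconds = seconds % 60
    return f"{hours:02}:{minutes:02}:{seconds:02}"

print(seconds_to_str(3725))  # 01:02:05
```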
8058
8059def clip_video(video_path: str, start: int, end: int) -> Optional[BinaryIO]:
8060    temp_fileobj = tempfile.NamedTemporaryFile(suffix=".mp4")
8061    (
8062        ffmpeg
8063        .input(video_path, ss=seconds_to_str(start), to=seconds_to_str(end))
8064        .output(temp_fileobj.name, c="copy")  # copy flag prevents decoding and re-encoding
8065        .overwrite_output()
8066        .run(quiet=True)
8067    )
8068    return temp_fileobj
8069
8070
8071def skip_live(info_dict):
8072    """
8073    Skip downloading live videos: yt_dlp doesn't respect the 20-minute download
8074    limit for live videos, and we don't want to hang on an hour-long stream.
8075    """
8076    if info_dict.get("is_live"):
8077        return "Skipping live video"
8078    return None
8079
8080
8081class YoutubeResult(BaseModel):
8082    video_id: str
8083    title: str
8084    description: Optional[str]
8085    length: int
8086    views: int
8087
8088
8089def search_videos(query, max_results=8):
8090    videos = []
8091    ydl_opts = {
8092        "format": "worst",
8093        "dumpjson": True,
8094        "extract_flat": True,
8095        "quiet": True,
8096        "simulate": True,
8097        "match_filter": skip_live,
8098    }
8099    with YoutubeDL(ydl_opts) as ydl:
8100        try:
8101            search_query = f"ytsearch{max_results}:{query}"
8102            result = ydl.extract_info(search_query, download=False)
8103            if "entries" in result and result["entries"]:
8104                videos = [
8105                    YoutubeResult(
8106                        video_id=entry["id"],
8107                        title=entry["title"],
8108                        description=entry.get("description"),
8109                        length=(int(entry.get("duration")) if entry.get("duration") else FIVE_MINUTES),
8110                        views=(entry.get("view_count") if entry.get("view_count") else 0),
8111                    ) for entry in result["entries"]
8112                ]
8113        except Exception as e:
8114            bt.logging.warning(f"Error searching for videos: {e}")
8115            return []
8116    return videos
8117
8118
8119def get_video_duration(filename: str) -> int:
8120    metadata = ffmpeg.probe(filename)
8121    video_stream = next((stream for stream in metadata['streams'] if stream['codec_type'] == 'video'), None)
8122    duration = int(float(video_stream['duration']))
8123    return duration
8124
8125
8126class IPBlockedException(Exception):
8127    def __init__(self, message: str):
8128        super().__init__(message)
8129
8130
8131class FakeVideoException(Exception):
8132    def __init__(self, message: str):
8133        super().__init__(message)
8134
8135
8136def is_valid_youtube_id(youtube_id: str) -> bool:
8137    return youtube_id is not None and len(youtube_id) == 11
8138
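The ID check above relies on YouTube video IDs being exactly 11 characters; anything else is rejected before a download is attempted:

```python
# Mirrors is_valid_youtube_id above: reject None and any string that is
# not exactly 11 characters long (the fixed length of YouTube video IDs).
def is_valid_youtube_id(youtube_id):
    return youtube_id is not None and len(youtube_id) == 11

print(is_valid_youtube_id("dQw4w9WgXcQ"))  # True
print(is_valid_youtube_id("short"))        # False
```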
8139def download_youtube_video(
8140    video_id: str, start: Optional[int]=None, end: Optional[int]=None, proxy: Optional[str]=None
8141) -> Optional[BinaryIO]:
8142    if not is_valid_youtube_id(video_id):
8143        raise FakeVideoException(f"Invalid Youtube video ID: {video_id}")
8144
8145    video_url = f"https://www.youtube.com/watch?v={video_id}"
8146    
8147    temp_fileobj = tempfile.NamedTemporaryFile(suffix=".mp4")
8148    
8149    ydl_opts = {
8150        "format": "worst",  # Download the worst quality
8151        "outtmpl": temp_fileobj.name,  # Set the output template to the temporary file's name
8152        "overwrites": True,
8153        "quiet": True,
8154        "noprogress": True,
8155        "match_filter": skip_live,
8156    }
8157
8158    if start is not None and end is not None:
8159        ydl_opts["download_ranges"] = lambda _, __: [{"start_time": start, "end_time": end}]
8160
8161    if proxy is not None:
8162        ydl_opts["proxy"] = proxy
8163
8164    try:
8165        with YoutubeDL(ydl_opts) as ydl:
8166            ydl.download([video_url])
8167
8168        # Check if the file is empty (download failed)
8169        if os.stat(temp_fileobj.name).st_size == 0:
8170            print(f"Error downloading Youtube video: {temp_fileobj.name} is empty")
8171            temp_fileobj.close()
8172            return None
8173
8174        return temp_fileobj
8175    except Exception as e:
8176        temp_fileobj.close()
8177        if (
8178            "Your IP is likely being blocked by Youtube" in str(e) or
8179            "Requested format is not available" in str(e)
8180        ):
8181            raise IPBlockedException(e)
8182
8183        # Quick check to see if miner passed an "unplayable" video (sign-in required, paid video, etc.).
8184        fake_video = False
8185        try:
8186            result = requests.get(video_url, proxies={"https": proxy})
8187            json_match = re.search(r"ytInitialPlayerResponse\s*=\s*(\{(?:.*?)\})\s*;\s*<", result.text)
8188            if json_match:
8189                player_info = json.loads(json_match.group(1))
8190                status = player_info.get('playabilityStatus', {}).get('status', 'ok')
8191                unacceptable_statuses = ('UNPLAYABLE',)
8192                if status in unacceptable_statuses or (status == 'ERROR' and player_info['playabilityStatus'].get('reason', '').lower() == 'video unavailable'):
8193                    if "sign in to confirm you’re not a bot" not in result.text.lower():
8194                        if player_info['playabilityStatus']['errorScreen']['playerErrorMessageRenderer']['subreason']['simpleText'] != "This content isn’t available.":
8195                            fake_video = True
8196                            print(f"Fake video submitted, youtube player status [{status}]: {player_info['playabilityStatus']}")
8197        except Exception as fake_check_exc:
8198            print(f"Error sanity checking playability: {fake_check_exc}")
8199        if fake_video:
8200            raise FakeVideoException("Unplayable video provided")
8201        if any(fake_vid_msg in str(e) for fake_vid_msg in ["Video unavailable", "is not a valid URL", "Incomplete YouTube ID", "Unsupported URL"]):
8202            if "Video unavailable. This content isn’t available." not in str(e):
8203                raise FakeVideoException(e)
8204        print(f"Error downloading video: {e}")
8205        return None
8206
8207
8208def copy_audio(video_path: str) -> BinaryIO:
8209    temp_audiofile = tempfile.NamedTemporaryFile(suffix=".aac")
8210    (
8211        ffmpeg
8212        .input(video_path)
8213        .output(temp_audiofile.name, vn=None, acodec='copy')
8214        .overwrite_output()
8215        .run(quiet=True)
8216    )
8217    return temp_audiofile
8218
8219def copy_audio_wav(video_path: str) -> BinaryIO:
8220    """
8221    Extract audio from video file to 16-bit PCM WAV format.
8222
8223    Args:
8224        video_path: Path to input video
8225
8226    Returns:
8227        BinaryIO: Temporary file containing WAV audio
8228    """
8229    temp_audiofile = tempfile.NamedTemporaryFile(suffix=".wav")
8230
8231    (
8232        ffmpeg
8233        .input(video_path)
8234        .output(
8235            temp_audiofile.name,
8236            acodec='pcm_s16le',  # 16-bit PCM
8237            ac=1,                # mono
8238            ar=16000,            # 16kHz sample rate
8239            vn=None             # no video
8240        )
8241        .overwrite_output()
8242        .run(quiet=True)
8243    )
8244
8245    return temp_audiofile
8246
8247def get_audio_bytes(video_path: str) -> bytes:
8248    audio_file = copy_audio_wav(video_path)
8249    with open(audio_file.name, 'rb') as f:
8250        wav_bytes = f.read()
8251
8252    # Clean up temp file
8253    audio_file.close()
8254
8255    # NOTE: MINERS, you cannot change the sample rate here or we will not be able to score your audio
8256    return wav_bytes
8257
8258
8259
8260---
8261File: /scripts/check_compatibility.sh
8262---
8263
8264#!/bin/bash
8265
8266if [ -z "$1" ]; then
8267    echo "Please provide a Python version as an argument."
8268    exit 1
8269fi
8270
8271python_version="$1"
8272all_passed=true
8273
8274GREEN='\033[0;32m'
8275YELLOW='\033[0;33m'
8276RED='\033[0;31m'
8277NC='\033[0m' # No Color
8278
8279check_compatibility() {
8280    all_supported=0
8281
8282    while read -r requirement; do
8283        # Skip lines starting with git+
8284        if [[ "$requirement" == git+* ]]; then
8285            continue
8286        fi
8287
8288        package_name=$(echo "$requirement" | awk -F'[!=<>]' '{print $1}' | awk -F'[' '{print $1}') # Strip off brackets
8289        echo -n "Checking $package_name... "
8290
8291        url="https://pypi.org/pypi/$package_name/json"
8292        response=$(curl -s $url)
8293        status_code=$(curl -s -o /dev/null -w "%{http_code}" $url)
8294
8295        if [ "$status_code" != "200" ]; then
8296            echo -e "${RED}Information not available for $package_name. Failure.${NC}"
8297            all_supported=1
8298            continue
8299        fi
8300
8301        classifiers=$(echo "$response" | jq -r '.info.classifiers[]')
8302        requires_python=$(echo "$response" | jq -r '.info.requires_python')
8303
8304        base_version="Programming Language :: Python :: ${python_version%%.*}"
8305        specific_version="Programming Language :: Python :: $python_version"
8306
8307        if echo "$classifiers" | grep -q "$specific_version" || echo "$classifiers" | grep -q "$base_version"; then
8308            echo -e "${GREEN}Supported${NC}"
8309        elif [ "$requires_python" != "null" ]; then
8310            if echo "$requires_python" | grep -Eq "==$python_version|>=$python_version|<=$python_version"; then
8311                echo -e "${GREEN}Supported${NC}"
8312            else
8313                echo -e "${RED}Not compatible with Python $python_version due to constraint $requires_python.${NC}"
8314                all_supported=1
8315            fi
8316        else
8317            echo -e "${YELLOW}Warning: Specific version not listed, assuming compatibility${NC}"
8318        fi
8319    done < requirements.txt
8320
8321    return $all_supported
8322}
8323
8324echo "Checking compatibility for Python $python_version..."
8325check_compatibility
8326if [ $? -eq 0 ]; then
8327    echo -e "${GREEN}All requirements are compatible with Python $python_version.${NC}"
8328else
8329    echo -e "${RED}Not all requirements are compatible with Python $python_version.${NC}"
8330    all_passed=false
8331fi
8332
8333echo ""
8334if $all_passed; then
8335    echo -e "${GREEN}All tests passed.${NC}"
8336else
8337    echo -e "${RED}Not all tests passed.${NC}"
8338    exit 1
8339fi
8340
8341
8342
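The package-name extraction in `check_compatibility()` splits each requirements line on version operators and then strips any extras bracket. A standalone sketch of that pipeline:

```shell
# Strip the package name from a requirements line, as in
# check_compatibility(): split on any of ! = < >, then drop
# the extras bracket (e.g. "[cuda]").
line='torch[cuda]>=2.0.0'
package_name=$(echo "$line" | awk -F'[!=<>]' '{print $1}' | awk -F'[' '{print $1}')
echo "$package_name"  # torch
```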
8343---
8344File: /scripts/check_requirements_changes.sh
8345---
8346
8347#!/bin/bash
8348
8349# Check if requirements files have changed in the last commit
8350if git diff --name-only HEAD~1 | grep -E 'requirements.txt'; then
8351    echo "Requirements files have changed. Running compatibility checks..."
8352    echo 'export REQUIREMENTS_CHANGED="true"' >> $BASH_ENV
8353else
8354    echo "Requirements files have not changed. Skipping compatibility checks..."
8355    echo 'export REQUIREMENTS_CHANGED="false"' >> $BASH_ENV
8356fi
8357
8358
8359
8360---
8361File: /scripts/install_staging.sh
8362---
8363
8364#!/bin/bash
8365
8366# Section 1: Build/Install
8367# This section is for first-time setup and installations.
8368
8369install_dependencies() {
8370    # Function to install packages on macOS
8371    install_mac() {
8372        which brew > /dev/null
8373        if [ $? -ne 0 ]; then
8374            echo "Installing Homebrew..."
8375            /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
8376        fi
8377        echo "Updating Homebrew packages..."
8378        brew update
8379        echo "Installing required packages..."
8380        brew install make llvm curl libssl protobuf tmux
8381    }
8382
8383    # Function to install packages on Ubuntu/Debian
8384    install_ubuntu() {
8385        echo "Updating system packages..."
8386        sudo apt update
8387        echo "Installing required packages..."
8388        sudo apt install --assume-yes make build-essential git clang curl libssl-dev llvm libudev-dev protobuf-compiler tmux
8389    }
8390
8391    # Detect OS and call the appropriate function
8392    if [[ "$OSTYPE" == "darwin"* ]]; then
8393        install_mac
8394    elif [[ "$OSTYPE" == "linux-gnu"* ]]; then
8395        install_ubuntu
8396    else
8397        echo "Unsupported operating system."
8398        exit 1
8399    fi
8400
8401    # Install rust and cargo
8402    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
8403
8404    # Update your shell's source to include Cargo's path
8405    source "$HOME/.cargo/env"
8406}
8407
8408# Call install_dependencies only if it's the first time running the script
8409if [ ! -f ".dependencies_installed" ]; then
8410    install_dependencies
8411    touch .dependencies_installed
8412fi
8413
8414
8415# Section 2: Test/Run
8416# This section is for running and testing the setup.
8417
8418# Create a coldkey for the owner role
8419wallet=${1:-owner}
8420
8421# Logic for setting up and running the environment
8422setup_environment() {
8423    # Clone subtensor and enter the directory
8424    if [ ! -d "subtensor" ]; then
8425        git clone https://github.com/opentensor/subtensor.git
8426    fi
8427    cd subtensor
8428    git pull
8429
8430    # Update to the nightly version of rust
8431    ./scripts/init.sh
8432
8433    cd ../bittensor-subnet-template
8434
8435    # Install the bittensor-subnet-template python package
8436    python -m pip install -e .
8437
8438    # Create and set up wallets
8439    # This section can be skipped if wallets are already set up
8440    if [ ! -f ".wallets_setup" ]; then
8441        btcli wallet new_coldkey --wallet.name $wallet --no_password --no_prompt
8442        btcli wallet new_coldkey --wallet.name miner --no_password --no_prompt
8443        btcli wallet new_hotkey --wallet.name miner --wallet.hotkey default --no_prompt
8444        btcli wallet new_coldkey --wallet.name validator --no_password --no_prompt
8445        btcli wallet new_hotkey --wallet.name validator --wallet.hotkey default --no_prompt
8446        touch .wallets_setup
8447    fi
8448
8449}
8450
8451# Call setup_environment every time
8452setup_environment 
8453
8454## Setup localnet
8455# assumes we are in the bittensor-subnet-template/ directory
8456# Initialize your local subtensor chain in development mode. This command will set up and run a local subtensor network.
8457cd ../subtensor
8458
8459# Start a new tmux session and create a new pane, but do not switch to it
8460echo "FEATURES='pow-faucet runtime-benchmarks' BT_DEFAULT_TOKEN_WALLET=$(cat ~/.bittensor/wallets/$wallet/coldkeypub.txt | grep -oP '"ss58Address": "\K[^"]+') bash scripts/localnet.sh" > setup_and_run.sh
8461chmod +x setup_and_run.sh
8462tmux new-session -d -s localnet -n 'localnet'
8463tmux send-keys -t localnet 'bash ../subtensor/setup_and_run.sh' C-m
8464
8465# Notify the user
8466echo ">> localnet.sh is running in a detached tmux session named 'localnet'"
8467echo ">> You can attach to this session with: tmux attach-session -t localnet"
8468
8469# Register a subnet (this needs to be run each time we start a new local chain)
8470btcli subnet create --wallet.name $wallet --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8471
8472# Transfer tokens to miner and validator coldkeys
8473export BT_MINER_TOKEN_WALLET=$(cat ~/.bittensor/wallets/miner/coldkeypub.txt | grep -oP '"ss58Address": "\K[^"]+')
8474export BT_VALIDATOR_TOKEN_WALLET=$(cat ~/.bittensor/wallets/validator/coldkeypub.txt | grep -oP '"ss58Address": "\K[^"]+')
8475
8476btcli wallet transfer --subtensor.network ws://127.0.0.1:9946 --wallet.name $wallet --dest $BT_MINER_TOKEN_WALLET --amount 1000 --no_prompt
8477btcli wallet transfer --subtensor.network ws://127.0.0.1:9946 --wallet.name $wallet --dest $BT_VALIDATOR_TOKEN_WALLET --amount 10000 --no_prompt
8478
8479# Register wallet hotkeys to subnet
8480btcli subnet register --wallet.name miner --netuid 1 --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8481btcli subnet register --wallet.name validator --netuid 1 --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8482
8483# Add stake to the validator
8484btcli stake add --wallet.name validator --wallet.hotkey default --subtensor.chain_endpoint ws://127.0.0.1:9946 --amount 10000 --no_prompt
8485
8486# Ensure both the miner and validator keys are successfully registered.
8487btcli subnet list --subtensor.chain_endpoint ws://127.0.0.1:9946
8488btcli wallet overview --wallet.name validator --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8489btcli wallet overview --wallet.name miner --subtensor.chain_endpoint ws://127.0.0.1:9946 --no_prompt
8490
8491cd ../bittensor-subnet-template
8492
8493
8494# Check if inside a tmux session
8495if [ -z "$TMUX" ]; then
8496    # Start a new tmux session and run the miner in the first pane
8497    tmux new-session -d -s bittensor -n 'miner' 'python neurons/miner.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name miner --wallet.hotkey default --logging.debug'
8498    
8499    # Split the window and run the validator in the new pane
8500    tmux split-window -h -t bittensor:miner 'python neurons/validator.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name validator --wallet.hotkey default --logging.debug'
8501    
8502    # Attach to the new tmux session
8503    tmux attach-session -t bittensor
8504else
8505    # If already in a tmux session, create two panes in the current window
8506    tmux split-window -h 'python neurons/miner.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name miner --wallet.hotkey default --logging.debug'
8507    tmux split-window -v -t 0 'python neurons/validator.py --netuid 1 --subtensor.chain_endpoint ws://127.0.0.1:9946 --wallet.name validator --wallet.hotkey default --logging.debug'
8508fi
8509
8510
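The `grep -oP '"ss58Address": "\K[^"]+'` pattern used in the transfer steps above pulls the ss58 address out of the coldkeypub JSON with a PCRE `\K` reset (requires GNU grep built with PCRE support). A sketch with inline sample JSON instead of a wallet file:

```shell
# Sample JSON standing in for ~/.bittensor/wallets/<name>/coldkeypub.txt.
coldkeypub='{"accountId": "0x1234", "ss58Address": "5FHneW46xGXgs5mUiveU4sbTyGBzmstUspZC92UhjJM694ty"}'

# \K discards everything matched so far, so only the address is printed.
addr=$(echo "$coldkeypub" | grep -oP '"ss58Address": "\K[^"]+')
echo "$addr"
```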
8511
8512---
8513File: /validator-api/static/dashboard.html
8514---
8515
8516<!DOCTYPE html>
8517<html lang="en">
8518<head>
8519    <meta charset="UTF-8">
8520    <meta name="viewport" content="width=device-width, initial-scale=1.0">
8521    <link rel="icon" href="static/favicon.ico" />
8522    <title>OMEGA Metadata Dashboard</title>
8523    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/vue.min.js"></script>
8524    <script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>
8525    <style>
8526        /* Apply base font styles */
8527        html, body {
8528            font-family: Roboto, sans-serif;
8529            line-height: 1.5;
8530            height: 101%; /* Slightly over full viewport height so the page always scrolls */
8531            margin: 0; /* Reset any default margin */
8532            padding: 0; /* Reset any default padding */
8533        }
8534
8535        body {
8536            font-size: 16px;
8537            line-height: 1.6;
8538            font-weight: 400;
8539            background-color: #0a1128;
8540            background-image: 
8541                linear-gradient(
8542                    to bottom,
8543                    rgba(255, 255, 255, 0) 0%, /* Fully transparent */
8544                    rgba(255, 255, 255, 0) calc(100% - 700px), /* Transparent until 700px from the bottom */
8545                    #0a1128 calc(100% - 200px),
8546                    #0a1128 100% /* Transition to the background color over the last 200px */
8547                ),
8548                url(https://omegatron.ai/static/images/0423e77f5905b1f1bccb.png);
8549            background-size: cover;
8550            background-repeat: no-repeat;
8551            background-position: center;
8552            color: #ffffff; /* Light text color for better readability */
8553        }
8554        /*
8555        body::before {
8556            position: absolute;
8557            content: "";
8558            width: 100%;
8559            height: 100%;
8560            top: 0;
8561            left: 0;
8562            background-image: linear-gradient(to bottom, #0a1128 0%, rgba(10, 17, 40, 0.8078431373) 30%, rgba(10, 17, 40, 0.5607843137) 60%, rgba(10, 17, 40, 0.1450980392) 95%) !important;
8563            z-index: 1;
8564        }*/
8565        .logo {
8566            display: block; /* Use block to apply margin auto for centering */
8567            width: 75px; /* Set the width of the logo container */
8568            height: 75px; /* Set the height of the logo container */
8569            margin: 0 auto; /* Center the logo horizontally */
8570            margin-top: 2rem; /* Add space above the logo */
8571        }
8572
8573        .logo svg {
8574            width: 100%; /* Make the SVG fill the container */
8575            height: 100%; /* Make the SVG fill the container */
8576        }
8577
8578        h1 {
8579            text-align: center;
8580            font-size: 2.5rem;
8581            margin-bottom: 3rem;
8582            margin-top: 0;
8583            text-shadow: 3px 3px 4px rgba(0, 0, 0, 0.75);
8584        }
8585
8586        /* Table styles */
8587        table {
8588            width: 90%;
8589            margin: 0 auto; /* Center table horizontally */
8590            border-collapse: collapse;
8591            text-indent: 0;
8592            color: #ffffff; /* Ensure table text is light-colored */
8593            border-radius: 10px; /* Rounded corners */
8594            box-shadow: 4px 4px 8px 0 rgba(70, 70, 70, 0.3); /* Drop shadow */
8595        }
8596
8597        th.center {
8598            text-align: center;
8599        }
8600
8601        .width520 {
8602            width: 520px;
8603        }
8604
8605        .width20 {
8606            width: 20px;
8607        }
8608
8609        /* Style for table headers and cells to inherit the rounded corners */
8610        th, td {
8611            /*border: 1px solid #ddd;  Light gray border for cells */
8612            padding: 8px; /* Padding for cell content */
8613            text-align: left;
8614            width: 10%;
8615        }
8616
8617        td {
8618            cursor: pointer;
8619        }
8620
8621        th {
8622            background-color: #272727; /* Dark background for headers */
8623            color: #ffffff; /* Light text color for headers */
8624            font-weight: bold; /* Bold font weight for better readability */
8625        }
8626
8627        /* Style for the first and last cells in each row to inherit the rounded corners */
8628        th:first-child {
8629            border-top-left-radius: 10px; /* Top-left rounded corner */
8630        }
8631
8632        th:last-child {
8633            border-top-right-radius: 10px; /* Top-right rounded corner */
8634        }
8635
8636        /* Style for the last row to inherit the rounded corners */
8637        tr:last-child td:first-child {
8638            border-bottom-left-radius: 10px; /* Bottom-left rounded corner */
8639        }
8640
8641        tr:last-child td:last-child {
8642            border-bottom-right-radius: 10px; /* Bottom-right rounded corner */
8643        }
8644
8645        /* Body styles */
8646        tbody tr:nth-child(odd) {
8647            background-color: #162035; /* Dark background for odd rows */
8648        }
8649
8650        tbody tr:nth-child(even) {
8651            background-color: #1f2a48; /* Slightly different dark background for even rows */
8652        }
8653
8654        /* Footer styles */
8655        tfoot {
8656            font-weight: bold;
8657            background-color: #1f2a48; /* Consistent background for footer */
8658        }
8659
8660        .refresh-icon {
8661            cursor: pointer;
8662        }
8663
8664        .sortable {
8665            cursor: pointer;
8666        }
8667
8668        .arrow {
8669            display: inline-block;
8670            margin-left: 5px;
8671        }
8672
8673        .arrow-up::before {
8674            content: '▲';
8675        }
8676
8677        .arrow-down::before {
8678            content: '▼';
8679        }
8680
8681        input[type="text"] {
8682            width: 30%; /* Match the table width or adjust as needed */
8683            padding: 10px; /* Larger padding for a taller input field */
8684            margin-bottom: 20px; /* Space between the input field and the table */
8685            font-size: 16px; /* Larger font size for better readability */
8686            border: 1px solid #ccc; /* Subtle border color */
8687            border-radius: 5px; /* Slightly rounded corners */
8688            box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.1); /* Inner shadow for depth */
8689            display: block; /* Ensure it's a block-level element */
8690            margin-left: auto; /* Combined with margin-right: auto, centers the input */
8691            margin-right: auto;
8692        }
8693
8694        .input-social-container {
8695            display: flex;
8696            align-items: center;
8697            justify-content: space-between;
8698        }
8699
8700        .social-icons {
8701            position: absolute;
8702            right: 5%;
8703            display: flex;
8704            align-items: center;
8705        }
8706
8707        .social-icons button {
8708            background: none;
8709            border: none;
8710            cursor: pointer;
8711        }
8712
8713        .social-icon {
8714            display: flex;
8715            justify-content: center;
8716            align-items: center;
8717            width: 50px; /* Adjust size as needed */
8718            height: 50px; /* Adjust size as needed */
8719            border-radius: 50%; /* Make it circular */
8720            border: 1px solid #ccc; /* Light gray border */
8721            margin-left: 15px; /* Space between icons */
8722            overflow: hidden; /* Ensure the content fits the circular shape */
8723            margin-bottom: 2em;
8724        }
8725
8726        .social-icon img,
8727        .social-icon svg {
8728            width: 100%;
8729            height: 100%;
8730            display: block;
8731            object-fit: cover; /* Ensure the image covers the area */
8732        }
8733
8734        .youtube-embed {
8735            width: 100%;
8736            height: 315px;
8737        }
8738
8739        .pagination {
8740            display: flex;
8741            justify-content: center;
8742            align-items: center;
8743            margin-top: 20px; /* Adjust the margin as needed */
8744            padding-top: 10px; /* Adjust the padding as needed */
8745        }
8746
8747        .pagination button {
8748            background-color: #068AC7;
8749            color: white;
8750            border: none;
8751            padding: 10px 20px;
8752            margin: 0 5px;
8753            cursor: pointer;
8754            border-radius: 5px;
8755            font-size: 16px;
8756        }
8757
8758        .pagination button:disabled {
8759            background-color: #cccccc;
8760            cursor: not-allowed;
8761        }
8762
8763        .pagination span {
8764            font-size: 16px;
8765            margin: 0 10px;
8766        }
8767
8768        /* Responsive styles for smaller screens */
8769        @media (max-width: 768px) {
8770            body {
8771                font-size: 0.9em; /* Smaller font size on mobile */
8772            }
8773
8774            h1 {
8775                font-size: 1.5rem; /* Adjust heading size for mobile */
8776            }
8777
8778            .logo {
8779                width: 30%; /* Increase width percentage for smaller screens */
8780            }
8781
8782            input[type="text"] {
8783                width: 80%; /* Increase width for mobile */
8784                padding: 8px; /* Adjust padding */
8785                font-size: 1em; /* Adjust font size */
8786            }
8787
8788            table {
8789                width: 100%; /* Full width on mobile */
8790            }
8791        }
8792    </style>
8793</head>
8794<body>
8795    <div id="app">
8796        <div class="logo">
8797            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 75 75">
8798                <!-- Define the drop shadow filter -->
8799                <defs>
8800                    <filter id="text-shadow" x="-20%" y="-20%" width="140%" height="140%">
8801                        <feGaussianBlur in="SourceAlpha" stdDeviation="2" result="blur"/>
8802                        <feOffset in="blur" dx="2" dy="2" result="offsetBlur"/>
8803                        <feMerge>
8804                            <feMergeNode in="offsetBlur"/>
8805                            <feMergeNode in="SourceGraphic"/>
8806                        </feMerge>
8807                    </filter>
8808                </defs>
8809                <text x="50%" y="70%" dominant-baseline="middle" text-anchor="middle" font-family="Roboto" font-size="100" fill="#068AC7" filter="url(#text-shadow)">Ω</text>
8810            </svg>
8811        </div>
8812        <h1>OMEGA Metadata Dashboard</h1>
8813        <div class="input-social-container">
8814            <!--<input type="text" v-model="filterKey" placeholder="Filter by hotkey...">-->
8815            <br /><br />
8816            <div class="social-icons">
8817                <a href="https://twitter.com/omegalabsai" target="_blank" class="social-icon"><button class="" type="button"><span class=""><img src="https://omegatron.ai/static/images/16b3234e15bf0aece98c.png"></span></button></a>
8818                <a href="https://github.com/omegalabsinc" target="_blank" class="social-icon"><button class="" type="button"><span class=""><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" fill="none"><path fill="#fff" d="M12 2.247a10 10 0 0 0-3.162 19.487c.5.088.687-.212.687-.475 0-.237-.012-1.025-.012-1.862-2.513.462-3.163-.613-3.363-1.175a3.64 3.64 0 0 0-1.025-1.413c-.35-.187-.85-.65-.012-.662a2 2 0 0 1 1.537 1.025 2.137 2.137 0 0 0 2.913.825c.043-.509.27-.984.637-1.338-2.225-.25-4.55-1.112-4.55-4.937a3.9 3.9 0 0 1 1.025-2.688 3.6 3.6 0 0 1 .1-2.65s.837-.262 2.75 1.025a9.43 9.43 0 0 1 5 0c1.912-1.3 2.75-1.025 2.75-1.025.37.838.406 1.786.1 2.65a3.87 3.87 0 0 1 1.025 2.688c0 3.837-2.337 4.687-4.562 4.937a2.37 2.37 0 0 1 .675 1.85c0 1.338-.013 2.413-.013 2.75 0 .263.188.575.688.475A10.005 10.005 0 0 0 12 2.247"></path></svg></span></button></a>
8819            </div>
8820        </div>
8821        <table>
8822            <thead>
8823                <tr>
8824                    <th class="sortable" @click="sortBy('video_id')">Video ID<span v-if="sortKey === 'video_id'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8825                    <th class="sortable" @click="sortBy('youtube_id')">YouTube ID<span v-if="sortKey === 'youtube_id'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8826                    <th class="sortable" @click="sortBy('start_time')">Start<span v-if="sortKey === 'start_time'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8827                    <th class="sortable" @click="sortBy('end_time')">End<span v-if="sortKey === 'end_time'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8828                    <th class="sortable width520" @click="sortBy('description')">Description<span v-if="sortKey === 'description'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8829                    <th class="sortable" @click="sortBy(5)">Desc Rel<span v-if="sortKey === 5" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8830                    <th class="sortable" @click="sortBy(6)">Query Rel<span v-if="sortKey === 6" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8831                    <th class="sortable" @click="sortBy('query')">Query<span v-if="sortKey === 'query'" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8832                    <th class="sortable" @click="sortBy(8)">Submitted<span v-if="sortKey === 8" class="arrow" :class="{'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0}"></span></th>
8833                    <th class="width20"><span class="refresh-icon" @click="fetchData">↻</span></th>
8834                </tr>
8835            </thead>
8836            <tbody>
8837                <template v-for="(video, index) in filteredVideos" :key="video[0]">
8838                    <tr>
8839                        <td @click="toggleRow(index)">{{ video[0] }}</td>
8840                        <td @click="toggleRow(index)">{{ video[1] }}</td>
8841                        <td @click="toggleRow(index)">{{ video[2] }}</td>
8842                        <td @click="toggleRow(index)">{{ video[3] }}</td>
8843                        <td class="width520" @click="toggleRow(index)">{{ video[4] }}</td>
8844                        <td @click="toggleRow(index)">{{ video[5] }}</td>
8845                        <td @click="toggleRow(index)">{{ video[6] }}</td>
8846                        <td @click="toggleRow(index)">{{ video[7] }}</td>
8847                        <td @click="toggleRow(index)">{{ video[8] }}</td>
8848                        <td class="width20"></td>
8849                    </tr>
8850                    <tr v-if="expandedRow === index" :key="'expanded-' + video[0]">
8851                        <td colspan="10">
8852                            <iframe class="youtube-embed" :src="getYoutubeEmbedUrl(video[1], video[2], video[3])" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
8853                        </td>
8854                    </tr>
8855                </template>
8856            </tbody>
8857        </table>
8858        <div class="pagination">
8859            <button @click="prevPage" :disabled="currentPage === 1">Previous</button>
8860            <span>Page {{ currentPage }} of {{ totalPages }}</span>
8861            <button @click="nextPage" :disabled="currentPage === totalPages">Next</button>
8862        </div>
8863    </div>
8864    <div> </div>
8865
8866    <script>
8867        new Vue({
8868            el: '#app',
8869            data: {
8870                videos: [],
8871                filterKey: '',
8872                sortKey: 'submitted_at',
8873                sortOrder: -1, // numeric: -1 = descending, 1 = ascending (matches the arrow-class checks)
8874                expandedRow: null,
8875                currentPage: 1,
8876                itemsPerPage: 1000,
8877                totalItems: 0
8878            },
8879            computed: {
8880                filteredVideos() {
8881                    //return this.videos;
8882                    let sortedVideos = [...this.videos].sort((a, b) => {
8883                        let modifier = this.sortOrder;
8884                        let aValue = a[this.sortKey];
8885                        let bValue = b[this.sortKey];
8886
8887                        // Convert to lowercase if sorting by string
8888                        if (typeof aValue === 'string') {
8889                            aValue = aValue.toLowerCase();
8890                            bValue = bValue.toLowerCase();
8891                        }
8892
8893                        if (aValue < bValue) return -1 * modifier;
8894                        if (aValue > bValue) return 1 * modifier;
8895                        return 0;
8896                    });
8897
8898                    return sortedVideos.filter(video => {
8899                        return video[0].toLowerCase().includes(this.filterKey.toLowerCase());
8900                    });
8901                },
8902                totalPages() {
8903                    return Math.ceil(this.totalItems / this.itemsPerPage);
8904                }
8905            },
8906            methods: {
8907                fetchData() {
8908                    axios.get('/dashboard/get-video-metadata', {
8909                        params: {
8910                            sort_by: this.sortKey,
8911                            sort_order: this.sortOrder > 0 ? 'asc' : 'desc', // keep the string form for the API
8912                            page: this.currentPage,
8913                            items_per_page: this.itemsPerPage
8914                        }
8915                    })
8916                    .then(response => {
8917                        this.videos = response.data.data;
8918                        this.totalItems = response.data.total_items;
8919                    })
8920                    .catch(error => {
8921                        console.error('There was an error fetching the video metadata:', error);
8922                    });
8923                },
8924                sortBy(key) {
8925                    if (this.sortKey === key) {
8926                        this.sortOrder *= -1; // flip direction on repeated clicks
8927                    } else {
8928                        this.sortKey = key;
8929                        this.sortOrder = -1; // a new column defaults to descending
8930                    }
8931                },
8932                toggleRow(index) {
8933                    if (this.expandedRow === index) {
8934                        this.expandedRow = null;
8935                    } else {
8936                        this.expandedRow = index;
8937                    }
8938                },
8939                getYoutubeEmbedUrl(youtubeId, startTime, endTime) {
8940                    return `https://www.youtube.com/embed/${youtubeId}?start=${startTime}&end=${endTime}&autoplay=1`;
8941                },
8942                prevPage() {
8943                    if (this.currentPage > 1) {
8944                        this.currentPage--;
8945                        this.fetchData();
8946                    }
8947                },
8948                nextPage() {
8949                    if (this.currentPage < this.totalPages) {
8950                        this.currentPage++;
8951                        this.fetchData();
8952                    }
8953                }
8954            },
8955            mounted() {
8956                this.fetchData();
8957            }
8958        });
8959    </script>
8960</body>
8961</html>
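The `sortBy` handler in the dashboard above is intended to flip the sort direction when the same column header is clicked twice; this only works if `sortOrder` stays numeric, since the header template toggles the arrow classes with `sortOrder > 0` / `sortOrder < 0` and the comparator multiplies by it. A minimal standalone sketch of that toggle pattern, under the assumption that `+1` means ascending and `-1` descending (`makeSorter` is an illustrative name, not part of the dashboard code):

```javascript
// Sketch of a numeric sort-toggle, assuming +1 = ascending and
// -1 = descending (matching the template's sortOrder > 0 checks).
function makeSorter() {
    const state = { sortKey: null, sortOrder: -1 };
    return {
        state,
        // Clicking the same column flips direction; a new column
        // starts descending, like the dashboard's default.
        sortBy(key) {
            if (state.sortKey === key) {
                state.sortOrder *= -1;
            } else {
                state.sortKey = key;
                state.sortOrder = -1;
            }
        },
        // Sort a copy of the rows (arrays indexed by column number).
        sort(rows) {
            return [...rows].sort((a, b) => {
                let aValue = a[state.sortKey];
                let bValue = b[state.sortKey];
                if (typeof aValue === 'string') aValue = aValue.toLowerCase();
                if (typeof bValue === 'string') bValue = bValue.toLowerCase();
                if (aValue < bValue) return -1 * state.sortOrder;
                if (aValue > bValue) return 1 * state.sortOrder;
                return 0;
            });
        }
    };
}
```

Multiplying a string such as `"desc"` by anything yields `NaN`, which makes the comparator return `NaN` and leaves the sort order undefined; keeping the direction numeric end-to-end avoids that class of bug.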
8962
8963
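The dashboard paginates server-side: `fetchData` sends `page` and `items_per_page`, the server returns `total_items`, and `totalPages` is derived as `ceil(totalItems / itemsPerPage)`, with the Previous/Next buttons disabled at the boundaries. A sketch of that arithmetic and the boundary guards (`makePager` is an illustrative name, not part of the dashboard code):

```javascript
// Sketch of the pagination math used by the dashboard above.
function makePager(itemsPerPage) {
    const state = { currentPage: 1, totalItems: 0 };
    // Last page may be partial, hence the ceiling.
    const totalPages = () => Math.ceil(state.totalItems / itemsPerPage);
    return {
        state,
        totalPages,
        // Guards mirror the :disabled bindings on the buttons.
        prevPage() { if (state.currentPage > 1) state.currentPage--; },
        nextPage() { if (state.currentPage < totalPages()) state.currentPage++; },
    };
}
```

For example, 2,500 items at 1,000 per page gives three pages, the last holding 500 rows.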
8964---
8965File: /validator-api/static/leaderboard.html
8966---
8967
8968<!DOCTYPE html>
8969<html lang="en">
8970<head>
8971    <meta charset="UTF-8">
8972    <meta name="viewport" content="width=device-width, initial-scale=1.0">
8973    <link rel="icon" href="static/favicon.ico" />
8974    <title>OMEGA Leaderboard</title>
8975    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/vue.min.js"></script>
8976    <script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>
8977    <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
8978    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/chartjs-adapter-date-fns.bundle.min.js"></script>
8979    <style>
8980        /* Apply base font styles */
8981        html, body {
8982            font-family: Roboto, sans-serif;
8983            line-height: 1.5;
8984            height: 101%; /* Keep html and body at least full window height; 101% always leaves a scrollbar */
8985            margin: 0; /* Reset any default margin */
8986            padding: 0; /* Reset any default padding */
8987        }
8988
8989        body {
8990            font-size: 16px;
8991            line-height: 1.6;
8992            font-weight: 400;
8993            background-color: #0a1128;
8994            background-image: 
8995                linear-gradient(
8996                    to bottom,
8997                    rgba(255, 255, 255, 0) 0%, /* Fully transparent */
8998                    rgba(255, 255, 255, 0) calc(100% - 700px), /* Transparent until 700px from the bottom */
8999                    #0a1128 calc(100% - 200px), /* Fade to the solid background color by 200px from the bottom */
9000                    #0a1128 100% /* Stay solid for the last 200px */
9001                ),
9002                url(https://omegatron.ai/static/images/0423e77f5905b1f1bccb.png);
9003            background-size: cover;
9004            background-repeat: no-repeat;
9005            background-position: center;
9006            color: #ffffff; /* Light text color for better readability */
9007        }
9008        /*
9009        body::before {
9010            position: absolute;
9011            content: "";
9012            width: 100%;
9013            height: 100%;
9014            top: 0;
9015            left: 0;
9016            background-image: linear-gradient(to bottom, #0a1128 0%, rgba(10, 17, 40, 0.8078431373) 30%, rgba(10, 17, 40, 0.5607843137) 60%, rgba(10, 17, 40, 0.1450980392) 95%) !important;
9017            z-index: 1;
9018        }*/
9019        .logo {
9020            display: block; /* Use block to apply margin auto for centering */
9021            width: 75px; /* Set the width of the logo container */
9022            height: 75px; /* Set the height of the logo container */
9023            margin: 0 auto; /* Center the logo horizontally */
9024            margin-top: 2rem; /* Add space above the logo */
9025        }
9026
9027        .logo svg {
9028            width: 100%; /* Make the SVG fill the container */
9029            height: 100%; /* Make the SVG fill the container */
9030        }
9031
9032        h1 {
9033            text-align: center;
9034            font-size: 2.5rem;
9035            margin-bottom: 3rem;
9036            margin-top: 0;
9037            text-shadow: 3px 3px 4px rgba(0, 0, 0, 0.75);
9038        }
9039
9040        /* Table styles */
9041        table {
9042            width: 90%;
9043            margin: 0 auto; /* Center table horizontally */
9044            border-collapse: collapse;
9045            text-indent: 0;
9046            color: #ffffff; /* Ensure table text is light-colored */
9047            border-radius: 10px; /* Rounded corners */
9048            box-shadow: 4px 4px 8px 0 rgba(70, 70, 70, 0.3); /* Drop shadow */
9049        }
9050
9051        /* not first child */
9052        tr:not(:first-child) {
9053            cursor: pointer;
9054            transition: background-color 0.3s ease, box-shadow 0.3s ease;
9055        }
9056        
9057        tr:not(:first-child):hover {
9058            background-color: rgba(98, 30, 100, 0.4); /* More translucent background color */
9059            box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
9060        }
9061
9062        tr.graph-row:hover {
9063            background-color: #1f2a48; /* Dark background for headers */
9064        }
9065
9066        th.center {
9067            text-align: center;
9068        }
9069
9070        .width520 {
9071            width: 520px;
9072        }
9073
9074        .width20 {
9075            width: 20px;
9076        }
9077
9078        /* Style for table headers and cells to inherit the rounded corners */
9079        th, td {
9080            /*border: 1px solid #ddd;  Light gray border for cells */
9081            padding: 8px; /* Padding for cell content */
9082            text-align: left;
9083            width: 10%;
9084        }
9085
9086        th {
9087            background-color: #272727; /* Dark background for headers */
9088            color: #ffffff; /* Light text color for headers */
9089            font-weight: bold; /* Bold font weight for better readability */
9090        }
9091
9092        /* Style for the first and last cells in each row to inherit the rounded corners */
9093        th:first-child {
9094            border-top-left-radius: 10px; /* Top-left rounded corner */
9095        }
9096
9097        th:last-child {
9098            border-top-right-radius: 10px; /* Top-right rounded corner */
9099        }
9100
9101        /* Style for the last row to inherit the rounded corners */
9102        tr:last-child td:first-child {
9103            border-bottom-left-radius: 10px; /* Bottom-left rounded corner */
9104        }
9105
9106        tr:last-child td:last-child {
9107            border-bottom-right-radius: 10px; /* Bottom-right rounded corner */
9108        }
9109
9110        /* Body styles */
9111        tbody tr:nth-child(odd) {
9112            background-color: #162035; /* Dark background for odd rows */
9113        }
9114
9115        tbody tr:nth-child(even) {
9116            background-color: #1f2a48; /* Slightly different dark background for even rows */
9117        }
9118
9119        /* Footer styles */
9120        tfoot {
9121            font-weight: bold;
9122            background-color: #1f2a48; /* Consistent background for footer */
9123        }
9124
9125        .refresh-icon {
9126            cursor: pointer;
9127        }
9128
9129        .sortable {
9130            cursor: pointer;
9131        }
9132
9133        .arrow {
9134            display: inline-block;
9135            margin-left: 5px;
9136        }
9137
9138        .arrow-up::before {
9139            content: '▲';
9140        }
9141
9142        .arrow-down::before {
9143            content: '▼';
9144        }
9145
9146        input[type="text"] {
9147            width: 30%; /* Match the table width or adjust as needed */
9148            padding: 10px; /* Larger padding for a taller input field */
9149            margin-bottom: 20px; /* Space between the input field and the table */
9150            font-size: 16px; /* Larger font size for better readability */
9151            border: 1px solid #ccc; /* Subtle border color */
9152            border-radius: 5px; /* Slightly rounded corners */
9153            box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.1); /* Inner shadow for depth */
9154            display: block; /* Ensure it's a block-level element */
9155            margin-left: 11%; /* Fixed left offset inside the flex row (the input is not centered on this page) */
9156            margin-right: auto;
9157        }
9158
9159        .input-social-container {
9160            display: flex;
9161            align-items: center;
9162        }
9163
9164        .info-text {
9165            margin-left: 9em;
9166            margin-right: 0em;
9167        }
9168       
9169        .social-icons {
9170            position: absolute;
9171            right: 5%;
9172            display: flex;
9173            align-items: center;
9174        }
9175
9176        .social-icons button {
9177            background: none;
9178            border: none;
9179            cursor: pointer;
9180        }
9181
9182        .social-icon {
9183            display: flex;
9184            justify-content: center;
9185            align-items: center;
9186            width: 50px; /* Adjust size as needed */
9187            height: 50px; /* Adjust size as needed */
9188            border-radius: 50%; /* Make it circular */
9189            border: 1px solid #ccc; /* Light gray border */
9190            margin-left: 15px; /* Space between icons */
9191            overflow: hidden; /* Ensure the content fits the circular shape */
9192            margin-bottom: 2em;
9193        }
9194
9195        .social-icon img,
9196        .social-icon svg {
9197            width: 100%;
9198            height: 100%;
9199            display: block;
9200            object-fit: cover; /* Ensure the image covers the area */
9201        }
9202
9203        /* Responsive styles for smaller screens */
9204        @media (max-width: 768px) {
9205            body {
9206                font-size: 0.9em; /* Smaller font size on mobile */
9207            }
9208
9209            h1 {
9210                font-size: 1.5rem; /* Adjust heading size for mobile */
9211            }
9212
9213            .logo {
9214                width: 30%; /* Increase width percentage for smaller screens */
9215            }
9216
9217            input[type="text"] {
9218                width: 80%; /* Increase width for mobile */
9219                padding: 8px; /* Adjust padding */
9220                font-size: 1em; /* Adjust font size */
9221            }
9222
9223            table {
9224                width: 100%; /* Full width on mobile */
9225            }
9226        }
9227
9228        .chart-card {
9229            background-color: rgba(255, 255, 255, 0.05);
9230            border: 1px solid rgba(255, 255, 255, 0.1);
9231            border-radius: 10px;
9232            box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1), 0 1px 3px rgba(0, 0, 0, 0.08);
9233            padding: 20px;
9234            margin: 20px auto;
9235            max-width: 88%;
9236            transition: transform 0.3s ease-in-out, box-shadow 0.3s ease-in-out;
9237            margin-bottom: 40px;
9238        }
9239
9240        .chart-card:hover {
9241            transform: translateY(-5px);
9242            box-shadow: 0 7px 14px rgba(0, 0, 0, 0.15), 0 3px 6px rgba(0, 0, 0, 0.1);
9243        }
9244
9245        .chart-title {
9246            color: #ffffff;
9247            font-size: 1.5rem;
9248            margin-bottom: 15px;
9249            text-align: center;
9250        }
9251
9252        .chart-container {
9253            position: relative;
9254            height: 40vh;
9255            width: 100%;
9256        }
9257
9258        /* Responsive adjustments */
9259        @media (max-width: 768px) {
9260            .chart-card {
9261                padding: 15px;
9262                margin: 15px auto;
9263            }
9264
9265            .chart-title {
9266                font-size: 1.2rem;
9267            }
9268
9269            .chart-container {
9270                height: 50vh;
9271            }
9272        }
9273
9274        .miner-chart-container {
9275            height: 300px;
9276            margin-top: 20px;
9277            margin-bottom: 20px;
9278        }
9279        .expanded-row {
9280            background-color: rgba(255, 255, 255, 0.05);
9281            transition: all 0.3s ease;
9282        }
9283        .expanded-content {
9284            padding: 20px;
9285        }
9286
9287        .focus-metrics {
9288            display: flex;
9289            justify-content: space-around;
9290            flex-wrap: wrap;
9291            margin-bottom: 20px;
9292        }
9293
9294        .focus-metric {
9295            background-color: rgba(255, 255, 255, 0.1);
9296            border-radius: 8px;
9297            padding: 15px;
9298            text-align: center;
9299            flex: 1;
9300            margin: 10px;
9301            min-width: 200px;
9302            backdrop-filter: blur(10px);
9303            -webkit-backdrop-filter: blur(10px);
9304        }
9305
9306        .focus-metric-title {
9307            font-size: 1.2rem;
9308            margin-bottom: 10px;
9309            color: #c0c0c0;
9310            font-weight: bold;
9311        }
9312
9313        .focus-metric-value {
9314            font-size: 1.5rem;
9315            font-weight: bold;
9316        }
9317
9318        .focus-metric-value-usd {
9319            font-size: 1.0rem;
9320            color: #aaaaaa;
9321        }
9322    </style>
9323</head>
9324<body>
9325    <div id="app">
9326        <div class="logo">
9327            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 75 75">
9328                <!-- Define the drop shadow filter -->
9329                <defs>
9330                    <filter id="text-shadow" x="-20%" y="-20%" width="140%" height="140%">
9331                        <feGaussianBlur in="SourceAlpha" stdDeviation="2" result="blur"/>
9332                        <feOffset in="blur" dx="2" dy="2" result="offsetBlur"/>
9333                        <feMerge>
9334                            <feMergeNode in="offsetBlur"/>
9335                            <feMergeNode in="SourceGraphic"/>
9336                        </feMerge>
9337                    </filter>
9338                </defs>
9339                <text x="50%" y="70%" dominant-baseline="middle" text-anchor="middle" font-family="Roboto" font-size="100" fill="#068AC7" filter="url(#text-shadow)">Ω</text>
9340            </svg>
9341        </div>
9342        <h1>OMEGA Leaderboard</h1>
9343        <div v-if="focusVideoData" class="chart-card">
9344            <h2 class="chart-title">Ω Focus KPIs</h2>
9345            <div class="focus-metrics">
9346                <div class="focus-metric">
9347                    <div class="focus-metric-title">TOTAL WALLETS</div>
9348                    <div class="focus-metric-value">{{ totalWallets }}</div>
9349                </div>
9350                <div class="focus-metric">
9351                    <div class="focus-metric-title">TOTAL TASKS DONE</div>
9352                    <div class="focus-metric-value">{{ totalVideosPurchased }}</div>
9353                </div>
9354                <div class="focus-metric">
9355                    <div class="focus-metric-title">TOTAL TAO BALANCE</div>
9356                    <div class="focus-metric-value">{{ totalTaoBalance.toFixed(3) }} <span class="focus-metric-value-usd">(${{ totalTaoBalanceUSD }} USD)</span></div>
9357                </div>
9358                <div class="focus-metric">
9359                    <div class="focus-metric-title">TOTAL TAO REVENUE</div>
9360                    <div class="focus-metric-value">{{ totalTaoRevenue.toFixed(3) }} <span class="focus-metric-value-usd">(${{ totalTaoRevenueUSD }} USD)</span></div>
9361                </div>
9362            </div>
9363            <div class="chart-container">
9364                <canvas ref="focusVideoChart"></canvas>
9365            </div>
9366        </div>
9367
9368        <div v-if="datasetSizeData" class="chart-card">
9369            <h2 class="chart-title">OMEGA Multimodal Dataset Size Over Time</h2>
9370            <div class="focus-metrics">
9371                <div class="focus-metric">
9372                    <div class="focus-metric-title">TOTAL ROWS</div>
9373                    <div class="focus-metric-value">{{ totalRows.toLocaleString() }}</div>
9374                </div>
9375                <div class="focus-metric">
9376                    <div class="focus-metric-title">MEMORY SIZE (GB)</div>
9377                    <div class="focus-metric-value">{{ memory_size_gb.toFixed(2) }}</div>
9378                </div>
9379            </div>
9380            <div class="chart-container">
9381                <canvas ref="datasetSizeChart"></canvas>
9382            </div>
9383        </div>
9384        
9385        <div class="input-social-container">
9386            <p class="info-text">Click on a row to display miner performance graph.</p>
9387            <input type="text" v-model="filterKey" placeholder="Filter by hotkey...">
9388            <div class="social-icons">
                <a href="https://twitter.com/omegalabsai" target="_blank" rel="noopener noreferrer" class="social-icon"><img src="https://omegatron.ai/static/images/16b3234e15bf0aece98c.png" alt="Twitter"></a>
                <a href="https://github.com/omegalabsinc" target="_blank" rel="noopener noreferrer" class="social-icon"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" fill="none"><path fill="#fff" d="M12 2.247a10 10 0 0 0-3.162 19.487c.5.088.687-.212.687-.475 0-.237-.012-1.025-.012-1.862-2.513.462-3.163-.613-3.363-1.175a3.64 3.64 0 0 0-1.025-1.413c-.35-.187-.85-.65-.012-.662a2 2 0 0 1 1.537 1.025 2.137 2.137 0 0 0 2.913.825c.043-.509.27-.984.637-1.338-2.225-.25-4.55-1.112-4.55-4.937a3.9 3.9 0 0 1 1.025-2.688 3.6 3.6 0 0 1 .1-2.65s.837-.262 2.75 1.025a9.43 9.43 0 0 1 5 0c1.912-1.3 2.75-1.025 2.75-1.025.37.838.406 1.786.1 2.65a3.87 3.87 0 0 1 1.025 2.688c0 3.837-2.337 4.687-4.562 4.937a2.37 2.37 0 0 1 .675 1.85c0 1.338-.013 2.413-.013 2.75 0 .263.188.575.688.475A10.005 10.005 0 0 0 12 2.247"></path></svg></a>
9391            </div>
9392        </div>
9393        <table>
9394            <tr>
9395                <th class="center width520">Hotkey</th>
9396                <th>Project</th>
9397                <th class="sortable" @click="sortBy('datapoints')">Datapoints<span v-if="sortKey === 'datapoints'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9398                <th class="sortable" @click="sortBy('avg_desc_relevance')">Avg Desc Relevance<span v-if="sortKey === 'avg_desc_relevance'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9399                <th class="sortable" @click="sortBy('avg_query_relevance')">Avg Query Relevance<span v-if="sortKey === 'avg_query_relevance'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9400                <th class="sortable" @click="sortBy('avg_score')">Avg Score<span v-if="sortKey === 'avg_score'" class="arrow" :class="{ 'arrow-up': sortOrder > 0, 'arrow-down': sortOrder < 0 }"></span></th>
9401                <th>Last Updated</th>
9402                <th class="width20"><span class="refresh-icon" @click="fetchData">↻</span></th>
9403            </tr>
9404            <template v-for="miner in filteredMiners">
9405                <tr :key="miner.hotkey" @click="toggleExpand(miner)" :class="{ 'expanded-row': miner.expanded }">
9406                    <td class="width520">{{ miner.hotkey }}</td>
9407                    <td>{{ miner.is_bittensor ? 'Bittensor' : 'Commune' }}</td>
9408                    <td>{{ miner.datapoints }}</td>
9409                    <td>{{ miner.avg_desc_relevance }}</td>
9410                    <td>{{ miner.avg_query_relevance }}</td>
9411                    <td>{{ miner.avg_score }}</td>
9412                    <td>{{ miner.last_updated }}</td>
9413                    <td class="width20"></td>
9414                </tr>
9415                <tr v-if="miner.expanded" :key="miner.hotkey + '-expanded'" class="graph-row">
9416                    <td colspan="8">
9417                        <div class="expanded-content">
9418                            <div class="miner-chart-container">
9419                                <canvas :ref="'minerChart-' + miner.hotkey"></canvas>
9420                            </div>
9421                        </div>
9422                    </td>
9423                </tr>
9424            </template>
9425        </table>
9426    </div>
9427    <div> </div>
9428
9429    <script>
9430        new Vue({
9431            el: '#app',
9432            data: {
9433                miners: [],
9434                filterKey: '',
9435                sortKey: '',
9436                sortOrder: 1,
9437                datasetSizeData: null,
9438                chartCreated: false,
9439                minerCharts: {},
9440                focusVideoData: null,
9441                totalWallets: 0,
9442                totalVideosPurchased: 0,
9443                totalTaoBalance: 0.0,
9444                totalTaoRevenue: 0.0,
9445                focusVideoChartCreated: false,
9446                taoPrice: 0, totalRows: 0, memory_size_gb: 0, // declared up front so Vue tracks them reactively
9447            },
9448            computed: {
9449                filteredMiners() {
9450                    return this.miners.filter(miner => {
9451                        return miner.hotkey.toLowerCase().includes(this.filterKey.toLowerCase());
9452                    }).sort((a, b) => {
9453                        let modifier = this.sortOrder;
9454                        if(a[this.sortKey] < b[this.sortKey]) return -1 * modifier;
9455                        if(a[this.sortKey] > b[this.sortKey]) return 1 * modifier;
9456                        return 0;
9457                    });
9458                },
9459                totalTaoBalanceUSD() {
9460                    return (this.totalTaoBalance * this.taoPrice).toFixed(2);
9461                },
9462                totalTaoRevenueUSD() {
9463                    return (this.totalTaoRevenue * this.taoPrice).toFixed(2);
9464                }
9465            },
9466            watch: {
9467                datasetSizeData: {
9468                    handler(newData) {
9469                        console.log('Dataset size data updated:', newData);
9470                        this.$nextTick(() => {
9471                            this.createOrUpdateChart();
9472                        });
9473                    },
9474                    deep: true
9475                }
9476            },
9477            methods: {
9478                fetchData() {
9479                    axios.get('/api/leaderboard')
9480                        .then(response => {
9481                            this.miners = response.data;
9482                        })
9483                        .catch(error => {
9484                            console.error('There was an error fetching the leaderboard data:', error);
9485                        });
9486                },
9487                sortBy(key) {
9488                    if (this.sortKey === key) {
9489                        this.sortOrder *= -1;
9490                    } else {
9491                        this.sortKey = key;
9492                        this.sortOrder = 1;
9493                    }
9494                },
9495                fetchDatasetSizeData() {
9496                    console.log('Fetching dataset size data...');
9497                    axios.get('/api/leaderboard-dataset-data')
9498                        .then(response => {
9499                            console.log('Dataset size data received:', response.data);
9500                            const data = response.data; if (!data.length) return; // guard: nothing to chart yet
9501                            
9502                            // Process the data to extract only what we need
9503                            this.datasetSizeData = {
9504                                labels: data.map(item => item.snapshot_date),
9505                                datasets: [
9506                                    {
9507                                        label: 'TOTAL ROWS',
9508                                        borderColor: '#98C379', // Green
9509                                        backgroundColor: '#98C379',
9510                                        data: data.map(item => ({
9511                                            x: item.snapshot_date,
9512                                            y: item.total_rows
9513                                        })),
9514                                        yAxisID: 'y'
9515                                    },
9516                                    {
9517                                        label: 'MEMORY SIZE (GB)',
9518                                        borderColor: '#61AFEF', // Blue
9519                                        backgroundColor: '#61AFEF',
9520                                        data: data.map(item => ({
9521                                            x: item.snapshot_date,
9522                                            y: item.memory_size_gb
9523                                        })),
9524                                        yAxisID: 'y1'
9525                                    }
9526                                ]
9527                            };
9528
9529                            this.totalRows = data[data.length - 1].total_rows;
9530                            this.memory_size_gb = data[data.length - 1].memory_size_gb;
9531                            
9532                            this.$nextTick(() => {
9533                                this.createOrUpdateChart();
9534                            });
9535                        })
9536                        .catch(error => {
9537                            console.error('There was an error fetching the dataset size data:', error);
9538                        });
9539                },
9540                createOrUpdateChart() {
9541                    const canvas = this.$refs.datasetSizeChart;
9542                    if (canvas && this.datasetSizeData && !this.chartCreated) {
9543                        const ctx = canvas.getContext('2d');
9544
9545                        this.chart = new Chart(ctx, {
9546                            type: 'line',
9547                            data: {
9548                                labels: this.datasetSizeData.labels,
9549                                datasets: this.datasetSizeData.datasets,
9550                            },
9551                            options: {
9552                                responsive: true,
9553                                maintainAspectRatio: false,
9554                                scales: {
9555                                    x: {
9556                                        type: 'time',
9557                                        time: {
9558                                            unit: 'month',
9559                                            displayFormats: {
9560                                                month: 'MMM yyyy'
9561                                            }
9562                                        },
9563                                        title: {
9564                                            display: true,
9565                                            text: 'DATE (UTC)',
9566                                            color: '#E0E0E0',
9567                                            font: {
9568                                                size: 14,
9569                                                weight: 'bold'
9570                                            }
9571                                        },
9572                                        ticks: {
9573                                            color: '#E0E0E0',
9574                                            font: {
9575                                                size: 12
9576                                            },
9577                                            source: 'data',
9578                                            maxRotation: 0,
9579                                            autoSkip: true,
9580                                            maxTicksLimit: 12
9581                                        }
9582                                    },
9583                                    y: {
9584                                        type: 'linear',
9585                                        display: true,
9586                                        position: 'left',
9587                                        title: {
9588                                            display: true,
9589                                            text: 'TOTAL ROWS',
9590                                            color: '#E0E0E0',
9591                                            font: {
9592                                                size: 14,
9593                                                weight: 'bold'
9594                                            }
9595                                        },
9596                                        ticks: {
9597                                            color: '#E0E0E0',
9598                                            font: {
9599                                                size: 12
9600                                            },
9601                                            callback: function(value, index, values) {
9602                                                return value.toLocaleString();
9603                                            }
9604                                        }
9605                                    },
9606                                    y1: {
9607                                        type: 'linear',
9608                                        display: true,
9609                                        position: 'right',
9610                                        title: {
9611                                            display: true,
9612                                            text: 'MEMORY SIZE (GB)',
9613                                            color: '#E0E0E0',
9614                                            font: {
9615                                                size: 14,
9616                                                weight: 'bold'
9617                                            }
9618                                        },
9619                                        ticks: {
9620                                            color: '#E0E0E0',
9621                                            font: {
9622                                                size: 12
9623                                            },
9624                                            callback: function(value, index, values) {
9625                                                return value.toLocaleString();
9626                                            }
9627                                        },
9628                                        grid: {
9629                                            drawOnChartArea: false
9630                                        }
9631                                    },
9632                                },
9633                                plugins: {
9634                                    legend: {
9635                                        labels: {
9636                                            color: '#E0E0E0',
9637                                            font: {
9638                                                size: 12
9639                                            }
9640                                        }
9641                                    },
9642                                    tooltip: {
9643                                        callbacks: {
9644                                            title: function(tooltipItems) {
9645                                                return tooltipItems[0].label + ' (UTC)';
9646                                            },
9647                                            label: function(context) {
9648                                                let label = context.dataset.label || '';
9649                                                if (label) {
9650                                                    label += ': ';
9651                                                }
9652                                                if (context.parsed.y !== null) {
9653                                                    label += context.parsed.y.toLocaleString();
9654                                                }
9655                                                return label;
9656                                            }
9657                                        }
9658                                    }
9659                                }
9660                            }
9661                        });
9662                        this.chartCreated = true;
9663                    } else if (this.chart && this.datasetSizeData) {
9664                        this.chart.data = this.datasetSizeData;
9665                        this.chart.update();
9666                    } else {
9667                        console.log('Unable to create or update chart. Canvas:', !!canvas, 'Data:', !!this.datasetSizeData, 'Chart created:', this.chartCreated);
9668                    }
9669                },
9670                toggleExpand(miner) {
9671                    this.$set(miner, 'expanded', !miner.expanded);
9672                    if (miner.expanded) {
9673                        this.fetchMinerData(miner.hotkey);
9674                    }
9675                },
9676                fetchMinerData(hotkey) {
9677                    axios.get(`/api/leaderboard-miner-data?hotkey=${encodeURIComponent(hotkey)}`)
9678                        .then(response => {
9679                            this.createMinerChart(hotkey, response.data);
9680                        })
9681                        .catch(error => {
9682                            console.error('Error fetching miner data:', error);
9683                        });
9684                },
9685                createMinerChart(hotkey, data) {
9686                    const canvas = (this.$refs[`minerChart-${hotkey}`] || [])[0];
9687                    if (!canvas) return; // the row may have collapsed before the response arrived
9688                    const ctx = canvas.getContext('2d');
9689                    if (this.minerCharts[hotkey]) {
9690                        this.minerCharts[hotkey].destroy();
9691                    }
9692
9693                    this.minerCharts[hotkey] = new Chart(ctx, {
9694                        type: 'line',
9695                        data: {
9696                            labels: data.map(item => item.snapshot_date),
9697                            datasets: [
9698                                {
9699                                    label: 'Datapoints',
9700                                    borderColor: '#98C379', // Green
9701                                    backgroundColor: '#98C379',
9702                                    data: data.map(item => ({x: item.snapshot_date, y: item.datapoints})),
9703                                    yAxisID: 'y'
9704                                },
9705                                {
9706                                    label: 'Avg Score',
9707                                    borderColor: '#61AFEF', // Blue
9708                                    backgroundColor: '#61AFEF',
9709                                    data: data.map(item => ({x: item.snapshot_date, y: item.avg_score})),
9710                                    yAxisID: 'y1'
9711                                },
9712                                {
9713                                    label: 'Avg Query Relevance',
9714                                    borderColor: '#D19A66',
9715                                    backgroundColor: '#D19A66',
9716                                    data: data.map(item => ({x: item.snapshot_date, y: item.avg_query_relevance})),
9717                                    yAxisID: 'y1'
9718                                },
9719                                {
9720                                    label: 'Avg Desc Relevance',
9721                                    borderColor: '#C678DD',
9722                                    backgroundColor: '#C678DD',
9723                                    data: data.map(item => ({x: item.snapshot_date, y: item.avg_desc_relevance})),
9724                                    yAxisID: 'y1'
9725                                },
9726                                {
9727                                    label: 'Incentive',
9728                                    borderColor: '#E06C75',
9729                                    backgroundColor: '#E06C75',
9730                                    data: data.map(item => ({x: item.snapshot_date, y: item.incentive})),
9731                                    yAxisID: 'y2'
9732                                }
9733                            ]
9734                        },
9735                        options: {
9736                            responsive: true,
9737                            maintainAspectRatio: false,
9738                            scales: {
9739                                x: {
9740                                    type: 'time',
9741                                    time: {
9742                                        parser: 'yyyy-MM-dd',
9743                                        unit: 'day',
9744                                        displayFormats: {
9745                                            day: 'MMM d, yyyy'
9746                                        }
9747                                    },
9748                                    title: {
9749                                        display: true,
9750                                        text: 'DATE (UTC)',
9751                                        color: '#E0E0E0',
9752                                        font: {
9753                                            size: 14,
9754                                            weight: 'bold'
9755                                        }
9756                                    },
9757                                    ticks: {
9758                                        color: '#E0E0E0',
9759                                        font: {
9760                                            size: 12
9761                                        }
9762                                    }
9763                                },
9764                                y: {
9765                                    type: 'linear',
9766                                    display: true,
9767                                    position: 'left',
9768                                    title: {
9769                                        display: true,
9770                                        text: 'DATAPOINTS',
9771                                        color: '#E0E0E0',
9772                                        font: {
9773                                            size: 14,
9774                                            weight: 'bold'
9775                                        }
9776                                    },
9777                                    ticks: {
9778                                        color: '#E0E0E0',
9779                                        font: {
9780                                            size: 12
9781                                        },
9782                                        callback: function(value) {
9783                                            return value.toLocaleString();
9784                                        }
9785                                    }
9786                                },
9787                                y1: {
9788                                    type: 'linear',
9789                                    display: true,
9790                                    position: 'right',
9791                                    min: 0,
9792                                    max: 1,
9793                                    title: {
9794                                        display: true,
9795                                        text: 'SCORES',
9796                                        color: '#E0E0E0',
9797                                        font: {
9798                                            size: 14,
9799                                            weight: 'bold'
9800                                        }
9801                                    },
9802                                    ticks: {
9803                                        color: '#E0E0E0',
9804                                        font: {
9805                                            size: 12
9806                                        }
9807                                    },
9808                                    grid: {
9809                                        drawOnChartArea: false
9810                                    }
9811                                },
9812                                y2: {
9813                                    type: 'linear',
9814                                    display: false,
9815                                    position: 'right',
9816                                    grid: {
9817                                        drawOnChartArea: false,
9818                                    },
9819                                    ticks: {
9820                                        callback: function(value, index, values) {
9821                                            return value.toFixed(5);
9822                                        }
9823                                    }
9824                                },
9825                            },
9826                            plugins: {
9827                                legend: {
9828                                    labels: {
9829                                        color: '#E0E0E0',
9830                                        font: {
9831                                            size: 12
9832                                        }
9833                                    }
9834                                },
9835                                tooltip: {
9836                                    callbacks: {
9837                                        title: function(tooltipItems) {
9838                                            return tooltipItems[0].label + ' (UTC)';
9839                                        },
9840                                        label: function(context) {
9841                                            let label = context.dataset.label || '';
9842                                            if (label) {
9843                                                label += ': ';
9844                                            }
9845                                            if (context.parsed.y !== null) {
9846                                                if (label.startsWith('Incentive')) {
9847                                                    // Display incentive with 5 decimal places
9848                                                    label += context.parsed.y.toFixed(5);
9849                                                } else {
9850                                                    // For other metrics, use the existing formatting
9851                                                    label += context.parsed.y.toLocaleString();
9852                                                }
9853                                            }
9854                                            return label;
9855                                        }
9856                                    }
9857                                }
9858                            }
9859                        }
9860                    });
9861
9862                },
9863                fetchFocusVideoData() {
9864                    console.log('Fetching focus video data...');
9865                    axios.get('/api/leaderboard-focus-data')
9866                        .then(response => {
9867                            console.log('Focus video data received:', response.data);
9868                            const data = response.data; if (!data.length) return; // guard: nothing to chart yet
9869                            
9870                            // Process the data
9871                            this.focusVideoData = {
9872                                labels: data.map(item => item.snapshot_date),
9873                                datasets: [
9874                                    {
9875                                        label: 'Total Wallets',
9876                                        borderColor: '#98C379', // Green
9877                                        backgroundColor: '#98C379',
9878                                        data: data.map(item => ({
9879                                            x: item.snapshot_date,
9880                                            y: item.total_wallets
9881                                        })),
9882                                        yAxisID: 'y2'
9883                                    },
9884                                    {
9885                                        label: 'Total Tasks Done',
9886                                        borderColor: '#61AFEF', // Blue
9887                                        backgroundColor: '#61AFEF',
9888                                        data: data.map(item => ({
9889                                            x: item.snapshot_date,
9890                                            y: item.total_videos_purchased
9891                                        })),
9892                                        yAxisID: 'y'
9893                                    },
9894                                    {
9895                                        label: 'Total TAO Balance',
9896                                        borderColor: '#C678DD', // Purple
9897                                        backgroundColor: '#C678DD',
9898                                        data: data.map(item => ({
9899                                            x: item.snapshot_date,
9900                                            y: item.total_tao_balance
9901                                        })),
9902                                        yAxisID: 'y1'
9903                                    },
9904                                    {
9905                                        label: 'Total TAO Revenue',
9906                                        borderColor: '#E06C75', // Red
9907                                        backgroundColor: '#E06C75',
9908                                        data: data.map(item => ({
9909                                            x: item.snapshot_date,
9910                                            y: item.total_tao_revenue
9911                                        })),
9912                                        yAxisID: 'y1'
9913                                    }
9914                                ]
9915                            };
9916
9917                            this.totalWallets = data[data.length - 1].total_wallets;
9918                            this.totalVideosPurchased = data[data.length - 1].total_videos_purchased;
9919                            this.totalTaoBalance = data[data.length - 1].total_tao_balance;
9920                            this.totalTaoRevenue = data[data.length - 1].total_tao_revenue;
9921                            
9922                            this.$nextTick(() => {
9923                                this.createOrUpdateFocusVideoChart();
9924                            });
9925                        })
9926                        .catch(error => {
9927                            console.error('There was an error fetching the focus video data:', error);
9928                        });
9929                },
9930                createOrUpdateFocusVideoChart() {
9931                    const canvas = this.$refs.focusVideoChart;
9932                    if (canvas && this.focusVideoData && !this.focusVideoChartCreated) {
9933                        const ctx = canvas.getContext('2d');
9934
9935                        this.focusVideoChart = new Chart(ctx, {
9936                            type: 'line',
9937                            data: this.focusVideoData,
9938                            options: {
9939                                responsive: true,
9940                                maintainAspectRatio: false,
9941                                scales: {
9942                                    x: {
9943                                        type: 'time',
9944                                        time: {
9945                                            unit: 'month',
9946                                            displayFormats: {
9947                                                month: 'MMM yyyy'
9948                                            }
9949                                        },
9950                                        title: {
9951                                            display: true,
9952                                            text: 'DATE',
9953                                            color: '#E0E0E0',
9954                                            font: {
9955                                                size: 14,
9956                                                weight: 'bold'
9957                                            }
9958                                        },
9959                                        ticks: {
9960                                            color: '#E0E0E0',
9961                                            font: {
9962                                                size: 12
9963                                            },
9964                                            maxRotation: 0,
9965                                            autoSkip: true,
9966                                            maxTicksLimit: 12
9967                                        }
9968                                    },
9969                                    y2: {
9970                                        type: 'linear',
9971                                        display: false, // Hide/Show the y2-axis
9972                                        position: 'left',
9973                                        title: {
9974                                            display: false,
9975                                            text: 'TOTAL WALLETS'
9976                                        },
9977                                        ticks: {
9978                                            color: '#98C379',
9979                                            callback: function(value) {
9980                                                return value >= 1000 ? value / 1000 + 'K' : value;
9981                                            }
9982                                        },
9983                                        grid: {
9984                                            drawOnChartArea: false
9985                                        }
9986                                    },
9987                                    y: {
9988                                        type: 'linear',
9989                                        display: false, // Hide/Show the y-axis
9990                                        position: 'left',
9991                                        title: {
9992                                            display: true,
9993                                            text: 'COUNT',
9994                                            color: '#E0E0E0',
9995                                            font: {
9996                                                size: 14,
9997                                                weight: 'bold'
9998                                            }
9999                                        },
10000                                        ticks: {
10001                                            color: '#E0E0E0',
10002                                            font: {
10003                                                size: 12
10004                                            },
10005                                            callback: function(value) {
10006                                                return value.toLocaleString();
10007                                            }
10008                                        }
10009                                    },
10010                                    y1: {
10011                                        type: 'linear',
10012                                        display: true,
10013                                        position: 'right',
10014                                        title: {
10015                                            display: true,
10016                                            text: 'TAO',
10017                                            color: '#E0E0E0',
10018                                            font: {
10019                                                size: 14,
10020                                                weight: 'bold'
10021                                            }
10022                                        },
10023                                        ticks: {
10024                                            color: '#E0E0E0',
10025                                            font: {
10026                                                size: 12
10027                                            },
10028                                            callback: function(value) {
10029                                                return value.toFixed(2);
10030                                            }
10031                                        },
10032                                        grid: {
10033                                            drawOnChartArea: false
10034                                        }
10035                                    },
10036                                },
10037                                plugins: {
10038                                    legend: {
10039                                        labels: {
10040                                            color: '#E0E0E0',
10041                                            font: {
10042                                                size: 12
10043                                            }
10044                                        }
10045                                    },
10046                                    tooltip: {
10047                                        callbacks: {
10048                                            title: function(tooltipItems) {
10049                                                return tooltipItems[0].label.replace(', 12:00:00 a.m.', '');
10050                                            },
10051                                            label: function(context) {
10052                                                let label = context.dataset.label || '';
10053                                                if (label) {
10054                                                    label += ': ';
10055                                                }
10056                                                if (context.parsed.y !== null) {
10057                                                    if (label.includes('TAO')) {
10058                                                        label += context.parsed.y.toFixed(3);
10059                                                    } else {
10060                                                        label += context.parsed.y.toLocaleString();
10061                                                    }
10062                                                }
10063                                                return label;
10064                                            }
10065                                        }
10066                                    }
10067                                }
10068                            }
10069                        });
10070                        this.focusVideoChartCreated = true;
10071                    } else if (this.focusVideoChart && this.focusVideoData) {
10072                        this.focusVideoChart.data = this.focusVideoData;
10073                        this.focusVideoChart.update();
10074                    } else {
10075                        console.log('Unable to create or update focus video chart. Canvas:', !!canvas, 'Data:', !!this.focusVideoData, 'Chart created:', this.focusVideoChartCreated);
10076                    }
10077                },
10078                fetchTaoPrice() {
10079                    axios.get('https://focus-api.omegatron.ai/get_tao_price')
10080                        .then(response => {
10081                            this.taoPrice = response.data;
10082                        })
10083                        .catch(error => {
10084                            console.error('Error fetching TAO price:', error);
10085                        });
10086                },
10087            },
10088            mounted() {
10089                this.fetchData();
10090                this.fetchDatasetSizeData();
10091                this.fetchFocusVideoData();
10092                this.fetchTaoPrice();
10093            }
10094        });
10095    </script>
10096</body>
10097</html>
10098
10099
10100---
10101File: /validator-api/validator_api/communex/_common.py
10102---
10103
10104import random
10105
10106 class ComxSettings:
10107    # TODO: improve node lists
10108    NODE_URLS: list[str] = [
10109        "wss://commune-api-node-0.communeai.net",
10110        "wss://commune-api-node-1.communeai.net",
10111        "wss://commune-api-node-2.communeai.net",
10112        "wss://commune-api-node-3.communeai.net",
10113        "wss://commune-api-node-4.communeai.net",
10114        "wss://commune-api-node-5.communeai.net",
10115        "wss://commune-api-node-6.communeai.net",
10116        "wss://commune-api-node-7.communeai.net",
10117        "wss://commune-api-node-8.communeai.net",
10118        "wss://commune-api-node-9.communeai.net",
10119        "wss://commune-api-node-10.communeai.net",
10120        "wss://commune-api-node-11.communeai.net",
10121        "wss://commune-api-node-12.communeai.net",
10122        "wss://commune-api-node-13.communeai.net",
10123        "wss://commune-api-node-14.communeai.net",
10124        "wss://commune-api-node-15.communeai.net",
10125        "wss://commune-api-node-16.communeai.net",
10126        "wss://commune-api-node-17.communeai.net",
10127        "wss://commune-api-node-18.communeai.net",
10128        "wss://commune-api-node-19.communeai.net",
10129        "wss://commune-api-node-20.communeai.net",
10130        "wss://commune-api-node-21.communeai.net",
10131        "wss://commune-api-node-22.communeai.net",
10132        "wss://commune-api-node-23.communeai.net",
10133        "wss://commune-api-node-24.communeai.net",
10134        "wss://commune-api-node-25.communeai.net",
10135        "wss://commune-api-node-26.communeai.net",
10136        "wss://commune-api-node-27.communeai.net",
10137        "wss://commune-api-node-28.communeai.net",
10138        "wss://commune-api-node-29.communeai.net",
10139        "wss://commune-api-node-30.communeai.net",
10140        "wss://commune-api-node-31.communeai.net",
10141    ]
10142    TESTNET_NODE_URLS: list[str] = [
10143        "wss://testnet-commune-api-node-0.communeai.net"]
10144
10145
10146def get_node_url(
10147    comx_settings: ComxSettings | None = None, *, use_testnet: bool = False
10148) -> str:
10149    comx_settings = comx_settings or ComxSettings()
10150    match use_testnet:
10151        case True:
10152            node_url = random.choice(comx_settings.TESTNET_NODE_URLS)
10153        case False:
10154            node_url = random.choice(comx_settings.NODE_URLS)
10155    return node_url
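
The selection logic above can be exercised without any network access. This is a minimal sketch mirroring `get_node_url` over a stand-in settings object; the names `DemoSettings` and `pick_node_url` are illustrative, not part of the module:

```python
import random

class DemoSettings:
    # Stand-ins for ComxSettings.NODE_URLS / TESTNET_NODE_URLS
    NODE_URLS = ["wss://node-a.example", "wss://node-b.example"]
    TESTNET_NODE_URLS = ["wss://testnet-node.example"]

def pick_node_url(settings=None, *, use_testnet=False):
    """Mirror of get_node_url: pick a random node from the relevant list."""
    settings = settings or DemoSettings()
    urls = settings.TESTNET_NODE_URLS if use_testnet else settings.NODE_URLS
    return random.choice(urls)

mainnet_url = pick_node_url()
testnet_url = pick_node_url(use_testnet=True)
```

Randomizing the node spreads load across the public endpoints instead of pinning every client to node 0.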
10156
10157
10158---
10159File: /validator-api/validator_api/communex/client.py
10160---
10161
10162import json
10163import queue
10164from concurrent.futures import Future, ThreadPoolExecutor
10165from contextlib import contextmanager
10166from copy import deepcopy
10167from dataclasses import dataclass
10168from typing import Any, Mapping, TypeVar
10169
10170from substrateinterface import ExtrinsicReceipt  # type: ignore
10171from substrateinterface import Keypair  # type: ignore
10172from substrateinterface import SubstrateInterface  # type: ignore
10173from substrateinterface.storage import StorageKey  # type: ignore
10174
10175from validator_api.communex.errors import ChainTransactionError, NetworkQueryError
10176from validator_api.communex.types import NetworkParams, Ss58Address, SubnetParams
10177
10178# TODO: InsufficientBalanceError, MismatchedLengthError etc
10179
10180MAX_REQUEST_SIZE = 9_000_000
10181
10182
10183@dataclass
10184class Chunk:
10185    batch_requests: list[tuple[Any, Any]]
10186    prefix_list: list[list[str]]
10187    fun_params: list[tuple[Any, Any, Any, Any, str]]
10188
10189
10190T1 = TypeVar("T1")
10191T2 = TypeVar("T2")
10192
10193
10194class CommuneClient:
10195    """
10196    A client for interacting with Commune network nodes, querying storage,
10197    submitting transactions, etc.
10198
10199    Attributes:
10200        wait_for_finalization: Whether to wait for transaction finalization.
10201
10202    Example:
10203    ```py
10204    client = CommuneClient(url="wss://commune-api-node-0.communeai.net")
10205    client.query(name='function_name', params=['param1', 'param2'])
10206    ```
10207
10208    Raises:
10209        AssertionError: If the maximum connections value is less than or equal
10210          to zero.
10211    """
10212
10213    wait_for_finalization: bool
10214    _num_connections: int
10215    _connection_queue: queue.Queue[SubstrateInterface]
10216
10217    def __init__(
10218        self,
10219        url: str,
10220        num_connections: int = 1,
10221        wait_for_finalization: bool = False,
10222    ):
10223        """
10224        Args:
10225            url: The URL of the network node to connect to.
10226            num_connections: The number of websocket connections to be opened.
10227            wait_for_finalization: Whether to wait for transaction finalization.
10228        """
10228        assert num_connections > 0
10229        self._num_connections = num_connections
10230        self.wait_for_finalization = wait_for_finalization
10231        self._connection_queue = queue.Queue(num_connections)
10232
10233        for _ in range(num_connections):
10234            self._connection_queue.put(SubstrateInterface(url))
10235
10236    @property
10237    def connections(self) -> int:
10238        """
10239        Gets the maximum allowed number of simultaneous connections to the
10240        network node.
10241        """
10242        return self._num_connections
10243
10244    @contextmanager
10245    def get_conn(self, timeout: float | None = None, init: bool = False):
10246        """
10247        Context manager to get a connection from the pool.
10248
10249        Tries to get a connection from the pool queue. If the queue is empty,
10250        it blocks for `timeout` seconds until a connection is available. If
10251        `timeout` is None, it blocks indefinitely.
10252
10253        Args:
10254            timeout: The maximum time in seconds to wait for a connection.
10255
10256        Yields:
10257            The connection object from the pool.
10258
10259        Raises:
10260            QueueEmptyError: If no connection is available within the timeout
10261              period.
10262        """
10263        conn = self._connection_queue.get(timeout=timeout)
10264        if init:
10265            conn.init_runtime()  # type: ignore
10266        try:
10267            yield conn
10268        finally:
10269            self._connection_queue.put(conn)
10270
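The pool discipline in `get_conn` (borrow from a bounded queue, always return the connection in `finally`) can be demonstrated with plain objects standing in for `SubstrateInterface`. A minimal sketch, not the client itself; `DemoPool` is a made-up name:

```python
import queue
from contextlib import contextmanager

class DemoPool:
    """Minimal connection pool mirroring CommuneClient.get_conn."""
    def __init__(self, connections):
        self._q = queue.Queue(len(connections))
        for conn in connections:
            self._q.put(conn)

    @contextmanager
    def get_conn(self, timeout=None):
        conn = self._q.get(timeout=timeout)  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._q.put(conn)  # returned to the pool even if the body raises

pool = DemoPool(["conn-0"])
with pool.get_conn() as c:
    in_use = c
    empty_while_borrowed = pool._q.empty()  # pool drained while borrowed
returned = pool._q.qsize()  # connection back after the with-block
```

Because the queue is bounded at `num_connections`, at most that many callers can hold a connection at once; the rest block in `get`.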
10271    def _get_storage_keys(
10272        self,
10273        storage: str,
10274        queries: list[tuple[str, list[Any]]],
10275        block_hash: str | None,
10276    ):
10277
10278        send: list[tuple[str, list[Any]]] = []
10279        prefix_list: list[Any] = []
10280
10282        with self.get_conn(init=True) as substrate:
10283            for function, params in queries:
10284                storage_key = StorageKey.create_from_storage_function(  # type: ignore
10285                    storage, function, params, runtime_config=substrate.runtime_config, metadata=substrate.metadata  # type: ignore
10286                )
10287
10288                prefix = storage_key.to_hex()
10289                prefix_list.append(prefix)
10290                send.append(("state_getKeys", [prefix, block_hash]))
10292        return send, prefix_list
10293
10294    def _get_lists(
10295        self,
10296        storage_module: str,
10297        queries: list[tuple[str, list[Any]]],
10298        substrate: SubstrateInterface,
10299    ) -> list[tuple[Any, Any, Any, Any, str]]:
10300        """
10301        Generates a list of tuples containing parameters for each storage function based on the given queries and substrate interface.
10302
10303        Args:
10304            storage_module: The name of the storage (pallet) module to look up.
10305            queries: A list of tuples, each consisting of a storage function name and its parameters.
10306            substrate: An instance of the SubstrateInterface class used to interact with the substrate.
10307
10308        Returns:
10309            A list of tuples in the format `(value_type, param_types, key_hashers, params, storage_function)` for each storage function in the given queries.
10310
10311        Example:
10312            >>> self._get_lists(
10313                    'storage_module', [('storage_function', ['param1', 'param2'])],
10314                    substrate=substrate_instance
10315                )
10316            [('value_type', 'param_types', 'key_hashers', ['param1', 'param2'], 'storage_function'), ...]
10317        """
10318
10319        function_parameters: list[tuple[Any, Any, Any, Any, str]] = []
10320        metadata_pallet = substrate.metadata.get_metadata_pallet(  #  type: ignore
10321            storage_module)  # type: ignore
10322        for storage_function, params in queries:
10323            storage_item = metadata_pallet.get_storage_function(  #  type: ignore
10324                storage_function)  # type: ignore
10325            value_type = storage_item.get_value_type_string()  # type: ignore
10326            param_types = storage_item.get_params_type_string()  # type: ignore
10327            key_hashers = storage_item.get_param_hashers()  # type: ignore
10328            function_parameters.append(
10329                (value_type, param_types, key_hashers,
10330                 params, storage_function)  # type: ignore
10331            )
10332        return function_parameters
10333
10334    def _send_batch(
10335        self,
10336        batch_payload: list[Any],
10337        request_ids: list[int],
10338        extract_result: bool = True,
10339    ):
10340        """
10341        Sends a batch of requests to the substrate and collects the results.
10342
10343        Args:
10344            batch_payload: The payload of the batch request.
10345            request_ids: A list of request IDs for tracking responses.
10346            extract_result: Whether to extract the raw `result` field from each
10347              response message instead of keeping the full message.
10348
10349        Returns:
10350            A list with the collected results, one per request ID.
10351
10352        Raises:
10353            NetworkQueryError: If there is an `error` in the response message.
10354
10355        """
10356        results: list[str | dict[Any, Any]] = []
10357        with self.get_conn(init=True) as substrate:
10358            try:
10359                substrate.websocket.send(  #  type: ignore
10360                    json.dumps(batch_payload))  # type: ignore
10361            except NetworkQueryError:
10362                pass
10363            while len(results) < len(request_ids):
10364                received_messages = json.loads(
10365                    substrate.websocket.recv())  # type: ignore
10366                if isinstance(received_messages, dict):
10367                    received_messages: list[dict[Any, Any]] = [
10368                        received_messages]
10369
10370                for message in received_messages:
10371                    if message.get("id") in request_ids:
10372                        if extract_result:
10373                            try:
10374                                results.append(message["result"])
10375                            except KeyError:
10376                                raise RuntimeError(
10377                                    f"Error extracting result from message: {message}"
10378                                )
10381                        else:
10382                            results.append(message)
10383                    if "error" in message:
10384                        raise NetworkQueryError(message["error"])
10385
10386            return results
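The id-matching loop in `_send_batch` can be illustrated offline: given a batch of decoded JSON-RPC response messages, keep the `result` of each message whose `id` belongs to this batch. A reduced sketch with made-up messages (`collect_results` is an illustrative name):

```python
import json

def collect_results(messages, request_ids, extract_result=True):
    """Keep only messages whose id is in request_ids, as _send_batch does."""
    results = []
    for message in messages:
        if message.get("id") in request_ids:
            results.append(message["result"] if extract_result else message)
        if "error" in message:
            raise RuntimeError(message["error"])
    return results

raw = json.dumps([
    {"jsonrpc": "2.0", "id": 1, "result": "0xaa"},
    {"jsonrpc": "2.0", "id": 99, "result": "ignored"},  # not part of this batch
    {"jsonrpc": "2.0", "id": 2, "result": "0xbb"},
])
collected = collect_results(json.loads(raw), request_ids=[1, 2])
```

Filtering on the id set is what lets several batches share one websocket: replies addressed to other in-flight batches are simply skipped.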
10387
10388    def _make_request_smaller(
10389        self,
10390        batch_request: list[tuple[T1, T2]],
10391        prefix_list: list[list[str]],
10392        fun_params: list[tuple[Any, Any, Any, Any, str]],
10393    ) -> tuple[list[list[tuple[T1, T2]]], list[Chunk]]:
10394        """
10395        Splits a batch of requests into smaller batches, each not exceeding MAX_REQUEST_SIZE bytes.
10396
10397        Args:
10398            batch_request: A list of requests to be sent in a batch.
10399            prefix_list: The storage-key prefixes matching each request.
10400            fun_params: The storage-function parameter tuples matching each request.
10401
10402        Returns:
10403            A tuple of the list of smaller request batches and the matching list of `Chunk` objects.
10404
10405        Example:
10406            >>> self._make_request_smaller(batch_request=[('method1', 'params1'), ('method2', 'params2')], prefix_list=[...], fun_params=[...])
10407        """
10408        assert len(prefix_list) == len(fun_params) == len(batch_request)
10409
10410        def estimate_size(request: tuple[T1, T2]):
10411            """Convert the batch request to a string and measure its length"""
10412            return len(json.dumps(request))
10413
10414        # Initialize variables
10415        result: list[list[tuple[T1, T2]]] = []
10416        current_batch = []
10417        current_prefix_batch = []
10418        current_params_batch = []
10419        current_size = 0
10420
10421        chunk_list: list[Chunk] = []
10422
10423        # Iterate through each request in the batch
10424        for request, prefix, params in zip(batch_request, prefix_list, fun_params):
10425            request_size = estimate_size(request)
10426
10427            # Check if adding this request exceeds the max size
10428            if current_size + request_size > MAX_REQUEST_SIZE:
10429                # If so, start a new batch
10430
10431                # Essentially checks that it's not the first iteration
10432                if current_batch:
10433                    chunk = Chunk(
10434                        current_batch, current_prefix_batch, current_params_batch
10435                    )
10436                    chunk_list.append(chunk)
10437                    result.append(current_batch)
10438
10439                current_batch = [request]
10440                current_prefix_batch = [prefix]
10441                current_params_batch = [params]
10442                current_size = request_size
10443            else:
10444                # Otherwise, add to the current batch
10445                current_batch.append(request)
10446                current_size += request_size
10447                current_prefix_batch.append(prefix)
10448                current_params_batch.append(params)
10449
10450        # Add the last batch if it's not empty
10451        if current_batch:
10452            result.append(current_batch)
10453            chunk = Chunk(current_batch, current_prefix_batch,
10454                          current_params_batch)
10455            chunk_list.append(chunk)
10456
10457        return result, chunk_list
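The greedy size-based batching above can be checked in isolation. This sketch applies the same rule (start a new batch once the JSON-encoded size would exceed a cap), with a tiny cap so the split is visible; `split_by_size` is an illustrative name, not part of the client:

```python
import json

def split_by_size(requests, max_size):
    """Greedy split: a new batch starts when adding a request would exceed max_size."""
    batches, current, current_size = [], [], 0
    for request in requests:
        size = len(json.dumps(request))  # same size estimate as estimate_size
        if current_size + size > max_size:
            if current:  # don't emit an empty batch on the first iteration
                batches.append(current)
            current, current_size = [request], size
        else:
            current.append(request)
            current_size += size
    if current:  # flush the trailing batch
        batches.append(current)
    return batches

reqs = [("state_getKeys", ["0x01"]), ("state_getKeys", ["0x02"]),
        ("state_getKeys", ["0x03"])]
batches = split_by_size(reqs, max_size=60)  # each request is 27 bytes encoded
```

With a 60-byte cap, two 27-byte requests fit in the first batch and the third spills into a second one; the real code uses the same logic with MAX_REQUEST_SIZE.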
10458
10459    def _are_changes_equal(self, change_a: Any, change_b: Any):
10460        for (a, b), (c, d) in zip(change_a, change_b):
10461            if a != c or b != d:
10462                return False
10463        return True
10463
10464    def _rpc_request_batch(
10465        self, batch_requests: list[tuple[str, list[Any]]], extract_result: bool = True
10466    ) -> list[str]:
10467        """
10468        Sends batch requests to the substrate node using multiple threads and collects the results.
10469
10470        Args:
10471            batch_requests: A list of (method, params) requests to be sent in batches.
10472            extract_result: Whether to extract the result from the response message.
10473
10474        Returns:
10475            A list of results from the batch requests.
10476
10477        Example:
10478            >>> self._rpc_request_batch([('method1', ['param1']), ('method2', ['param2'])])
10479            ['result1', 'result2', ...]
10480        """
10483
10484        chunk_results: list[Any] = []
10485        # smaller_requests = self._make_request_smaller(batch_requests)
10486        request_id = 0
10487        with ThreadPoolExecutor() as executor:
10488            futures: list[Future[list[str | dict[Any, Any]]]] = []
10489            for chunk in [batch_requests]:
10490                request_ids: list[int] = []
10491                batch_payload: list[Any] = []
10492                for method, params in chunk:
10493                    request_id += 1
10494                    request_ids.append(request_id)
10495                    batch_payload.append(
10496                        {
10497                            "jsonrpc": "2.0",
10498                            "method": method,
10499                            "params": params,
10500                            "id": request_id,
10501                        }
10502                    )
10503
10504                futures.append(
10505                    executor.submit(
10506                        self._send_batch,
10507                        batch_payload=batch_payload,
10508                        request_ids=request_ids,
10509                        extract_result=extract_result,
10510                    )
10511                )
10512            for future in futures:
10513                result = future.result()
10514                chunk_results.append(result)
10515        return chunk_results
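Payload construction in the loop above is plain JSON-RPC 2.0 batching: each (method, params) pair is wrapped in an envelope with a monotonically increasing `id`. A self-contained sketch of just that step (`build_batch_payload` is an illustrative name; the method names are only examples):

```python
def build_batch_payload(requests, start_id=0):
    """Assign sequential ids and wrap each request as a JSON-RPC 2.0 call."""
    request_id = start_id
    request_ids, payload = [], []
    for method, params in requests:
        request_id += 1
        request_ids.append(request_id)
        payload.append({
            "jsonrpc": "2.0",
            "method": method,
            "params": params,
            "id": request_id,
        })
    return payload, request_ids

payload, ids = build_batch_payload([("state_getStorage", ["0x01"]),
                                    ("chain_getBlockHash", [])])
```

The returned id list is what `_send_batch` later uses to match responses, since a batch's replies may arrive in any order.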
10516
10517    def _rpc_request_batch_chunked(
10518        self, chunk_requests: list[Chunk], extract_result: bool = True
10519    ):
10520        """
10521        Sends chunked batch requests to the substrate node using multiple
10522        threads and collects the results.
10523
10524        Chunks whose key lists exceed an internal maximum are split into
10525        smaller chunks before being dispatched.
10526
10527        Args:
10528            chunk_requests: A list of `Chunk` objects to be sent in batches.
10529            extract_result: Whether to extract the result from the response
10530              message.
10531
10532        Returns:
10533            A tuple of the list of results from the batch requests and the
10534            (possibly split) list of chunks that was actually sent.
10535        """
10536
10537        def split_chunks(chunk: Chunk, chunk_info: list[Chunk], chunk_info_idx: int):
10538            split_queries: list[tuple[Any, Any]] = []
10539            mutated_chunk_info = deepcopy(chunk_info)
10540            max_n_keys = 35000
10541            for query in chunk.batch_requests:
10542                result_keys = query[1][0]
10543                keys_amount = len(result_keys)
10544                if keys_amount > max_n_keys:
10545                    mutated_chunk_info.pop(chunk_info_idx)
10546                    for i in range(0, keys_amount, max_n_keys):
10547                        new_chunk = deepcopy(chunk)
10548                        split_keys = result_keys[i: i + max_n_keys]
10549                        split_query = deepcopy(query)
10550                        split_query[1][0] = split_keys
10551                        new_chunk.batch_requests = [split_query]
10552                        split_queries.append(split_query)
10553                        mutated_chunk_info.insert(chunk_info_idx, new_chunk)
10554                else:
10555                    split_queries.append(query)
10556            return split_queries, mutated_chunk_info
10557
10558        assert len(chunk_requests) > 0
10559        mutated_chunk_info: list[Chunk] = []
10560        chunk_results: list[Any] = []
10561        # smaller_requests = self._make_request_smaller(batch_requests)
10562        request_id = 0
10563
10564        with ThreadPoolExecutor() as executor:
10565            futures: list[Future[list[str | dict[Any, Any]]]] = []
10566            for idx, macro_chunk in enumerate(chunk_requests):
10567                _, mutated_chunk_info = split_chunks(
10568                    macro_chunk, chunk_requests, idx)
10569            for chunk in mutated_chunk_info:
10570                request_ids: list[int] = []
10571                batch_payload: list[Any] = []
10572                for method, params in chunk.batch_requests:
10573                    # for method, params in micro_chunk:
10574                    request_id += 1
10575                    request_ids.append(request_id)
10576                    batch_payload.append(
10577                        {
10578                            "jsonrpc": "2.0",
10579                            "method": method,
10580                            "params": params,
10581                            "id": request_id,
10582                        }
10583                    )
10584                futures.append(
10585                    executor.submit(
10586                        self._send_batch,
10587                        batch_payload=batch_payload,
10588                        request_ids=request_ids,
10589                        extract_result=extract_result,
10590                    )
10591                )
10592            for future in futures:
10593                result = future.result()
10594                chunk_results.append(result)
10595        return chunk_results, mutated_chunk_info
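The key-splitting rule inside `split_chunks` (slice any key list longer than `max_n_keys` into fixed-size pieces) is range-stepped slicing. A reduced sketch with a small limit so the pieces are easy to see; `split_keys` here is an illustrative helper, not part of the client:

```python
def split_keys(result_keys, max_n_keys):
    """Slice a key list into pieces of at most max_n_keys, as split_chunks does."""
    if len(result_keys) <= max_n_keys:
        return [result_keys]  # small enough: keep as a single piece
    return [result_keys[i: i + max_n_keys]
            for i in range(0, len(result_keys), max_n_keys)]

pieces = split_keys(list(range(7)), max_n_keys=3)
small = split_keys([1, 2], max_n_keys=3)
```

Keeping each RPC under roughly 35,000 keys bounds both the request size and the node's per-call decoding work.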
10596
10597    def _decode_response(
10598        self,
10599        response: list[str],
10600        function_parameters: list[tuple[Any, Any, Any, Any, str]],
10601        prefix_list: list[Any],
10602        block_hash: str,
10603    ) -> dict[str, dict[Any, Any]]:
10604        """
10605        Decodes a response from the substrate interface and organizes the data into a dictionary.
10606
10607        Args:
10608            response: A list of encoded responses from a substrate query.
10609            function_parameters: A list of tuples containing the parameters for each storage function.
10610            last_keys: A list of the last keys used in the substrate query.
10611            prefix_list: A list of prefixes used in the substrate query.
10612            substrate: An instance of the SubstrateInterface class.
10613            block_hash: The hash of the block to be queried.
10614
10615        Returns:
10616            A dictionary where each key is a storage function name and the value is another dictionary.
10617            This inner dictionary's key is the decoded key from the response and the value is the corresponding decoded value.
10618
10619        Raises:
10620            ValueError: If an unsupported hash type is encountered in the `concat_hash_len` function.
10621
10622        Example:
10623            >>> self._decode_response(
10624                    response=[...],
10625                    function_parameters=[...],
10626                    prefix_list=[...],
10627                    block_hash="0x123..."
10628                )
10629            {'storage_function_name': {decoded_key: decoded_value, ...}, ...}
10632        """
10633
10634        def concat_hash_len(key_hasher: str) -> int:
10635            """
10636            Determines the length of the hash based on the given key hasher type.
10637
10638            Args:
10639                key_hasher: The type of key hasher.
10640
10641            Returns:
10642                The length of the hash corresponding to the given key hasher type.
10643
10644            Raises:
10645                ValueError: If the key hasher type is not supported.
10646
10647            Example:
10648                >>> concat_hash_len("Blake2_128Concat")
10649                16
10650            """
10651
10652            if key_hasher == "Blake2_128Concat":
10653                return 16
10654            elif key_hasher == "Twox64Concat":
10655                return 8
10656            elif key_hasher == "Identity":
10657                return 0
10658            else:
10659                raise ValueError("Unsupported hash type")
10660
10661        assert len(response) == len(function_parameters) == len(prefix_list)
10662        result_dict: dict[str, dict[Any, Any]] = {}
10663        for res, fun_params_tuple, prefix in zip(
10664            response, function_parameters, prefix_list
10665        ):
10666            if not res:
10667                continue
10668            res = res[0]
10669            changes = res["changes"]  # type: ignore
10670            value_type, param_types, key_hashers, params, storage_function = (
10671                fun_params_tuple
10672            )
10673            with self.get_conn(init=True) as substrate:
10674                for item in changes:
10675                    # Determine type string
10676                    key_type_string: list[Any] = []
10677                    for n in range(len(params), len(param_types)):
10678                        key_type_string.append(
10679                            f"[u8; {concat_hash_len(key_hashers[n])}]"
10680                        )
10681                        key_type_string.append(param_types[n])
10682
10683                    item_key_obj = substrate.decode_scale(  # type: ignore
10684                        type_string=f"({', '.join(key_type_string)})",
10685                        scale_bytes="0x" + item[0][len(prefix):],
10686                        return_scale_obj=True,
10687                        block_hash=block_hash,
10688                    )
10689                    # strip key_hashers to use as item key
10690                    if len(param_types) - len(params) == 1:
10691                        item_key = item_key_obj.value_object[1]  # type: ignore
10692                    else:
10693                        item_key = tuple(  # type: ignore
10694                            item_key_obj.value_object[key + 1]  # type: ignore
10695                            for key in range(  # type: ignore
10696                                len(params), len(param_types) + 1, 2
10697                            )
10698                        )
10699
10700                    item_value = substrate.decode_scale(  # type: ignore
10701                        type_string=value_type,
10702                        scale_bytes=item[1],
10703                        return_scale_obj=True,
10704                        block_hash=block_hash,
10705                    )
10706                    result_dict.setdefault(storage_function, {})
10707
10708                    result_dict[storage_function][item_key.value] = item_value.value  #  type: ignore
10709
10710        return result_dict
10711
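The decoding loop above skips the hashed portion of each concat-hashed storage key before SCALE-decoding the trailing raw key; the skip lengths come from `concat_hash_len`. An illustrative standalone sketch of that stripping step (`HASHER_PREFIX_LEN` and `strip_concat_hash` are hypothetical helpers, not part of this codebase):

```python
# Hash prefix lengths mirror `concat_hash_len` in the code above.
HASHER_PREFIX_LEN = {
    "Blake2_128Concat": 16,  # 16-byte Blake2 hash, then the raw key
    "Twox64Concat": 8,       # 8-byte XX hash, then the raw key
    "Identity": 0,           # no hash; the key is stored as-is
}


def strip_concat_hash(key_bytes: bytes, key_hasher: str) -> bytes:
    """Return the raw key portion of a concat-hashed storage key."""
    try:
        offset = HASHER_PREFIX_LEN[key_hasher]
    except KeyError:
        raise ValueError("Unsupported hash type") from None
    return key_bytes[offset:]


# A Twox64Concat key: 8 hash bytes followed by the raw key byte.
raw = strip_concat_hash(b"\x01" * 8 + b"\x2a", "Twox64Concat")
assert raw == b"\x2a"
```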
10712    def query_batch(
10713        self, functions: dict[str, list[tuple[str, list[Any]]]]
10714    ) -> dict[str, str]:
10715        """
10716        Executes batch queries on a substrate and returns results in a dictionary format.
10717
10718        Args:
10719            functions: A dictionary mapping module names to lists of
10720                query calls (function name and parameters).
10721
10722        Returns:
10723            A dictionary where keys are storage function names and values are the query results.
10724
10725        Raises:
10726            Exception: If no result is found from the batch queries.
10727
10728        Example:
10729            >>> self.query_batch({'module_name': [('function_name', ['param1', 'param2'])]})
10730            {'function_name': 'query_result', ...}
10731        """
10732
10733        result: dict[str, str] | None = None
10734        with self.get_conn(init=True) as substrate:
10735            for module, queries in functions.items():
10736                storage_keys: list[Any] = []
10737                for fn, params in queries:
10738                    storage_function = substrate.create_storage_key(  # type: ignore
10739                        pallet=module, storage_function=fn, params=params
10740                    )
10741                    storage_keys.append(storage_function)
10742
10743                block_hash = substrate.get_block_hash()
10744                responses: list[Any] = substrate.query_multi(  # type: ignore
10745                    storage_keys=storage_keys, block_hash=block_hash
10746                )
10747
10748                if result is None:
10749                    result = {}
10750                for item in responses:
10751                    fun = item[0]
10752                    query = item[1]
10753                    storage_fun = fun.storage_function
10754                    result[storage_fun] = query.value
10755
10756            if result is None:
10757                raise Exception("No result")
10758
10759        return result
10760
10761    def query_batch_map(
10762        self,
10763        functions: dict[str, list[tuple[str, list[Any]]]],
10764        block_hash: str | None = None,
10765    ) -> dict[str, dict[Any, Any]]:
10766        """
10767        Queries multiple storage functions using a map batch approach and returns the combined result.
10768
10769        Args:
10770            functions: A dictionary mapping module names to lists of query calls.
10771            block_hash: The block hash to query at; defaults to the latest block.
10772
10773        Returns:
10774            The combined result of the map batch query.
10775
10776        Example:
10777            >>> self.query_batch_map({'module_name': [('function_name', ['param1', 'param2'])]})
10778            {'function_name': {decoded_key: decoded_value, ...}, ...}
10779        """
10780        multi_result: dict[str, dict[Any, Any]] = {}
10781
10782        def recursive_update(
10783            d: dict[str, dict[T1, T2] | dict[str, Any]],
10784            u: Mapping[str, dict[Any, Any] | str],
10785        ) -> dict[str, dict[T1, T2]]:
10786            for k, v in u.items():
10787                if isinstance(v, dict):
10788                    d[k] = recursive_update(d.get(k, {}), v)  # type: ignore
10789                else:
10790                    d[k] = v  # type: ignore
10791            return d  # type: ignore
10792
10793        def get_page():
10794            send, prefix_list = self._get_storage_keys(
10795                storage, queries, block_hash)
10796            with self.get_conn(init=True) as substrate:
10797                function_parameters = self._get_lists(
10798                    storage, queries, substrate)
10799            responses = self._rpc_request_batch(send)
10800            # assumption: `send` holds only the storage-function key requests,
10801            # so it stays small regardless of the number of queries
10802            assert len(responses) == 1
10803            res = responses[0]
10804            built_payload: list[tuple[str, list[Any]]] = []
10805            for result_keys in res:
10806                built_payload.append(
10807                    ("state_queryStorageAt", [result_keys, block_hash])
10808                )
10809            _, chunks_info = self._make_request_smaller(
10810                built_payload, prefix_list, function_parameters
10811            )
10812            chunks_response, chunks_info = self._rpc_request_batch_chunked(
10813                chunks_info)
10814            return chunks_response, chunks_info
10815
10816        if not block_hash:
10817            with self.get_conn(init=True) as substrate:
10818                block_hash = substrate.get_block_hash()
10819        for storage, queries in functions.items():
10820            chunks, chunks_info = get_page()
10821            # if these lengths differ, something is wrong in the code
10822            # and we won't be able to decode the data properly
10823            assert len(chunks) == len(chunks_info)
10824            for chunk_info, response in zip(chunks_info, chunks):
10825                storage_result = self._decode_response(
10826                    response, chunk_info.fun_params, chunk_info.prefix_list, block_hash
10827                )
10828                multi_result = recursive_update(multi_result, storage_result)
10829
10830        return multi_result
10831
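The nested `recursive_update` helper above deep-merges each page's decoded results into `multi_result`, so nested dictionaries are merged key-by-key rather than replaced wholesale. A minimal standalone sketch of the same merge (example module/key names are hypothetical):

```python
def recursive_update(d: dict, u: dict) -> dict:
    # Merge `u` into `d`; nested dicts are merged recursively,
    # scalar values overwrite.
    for k, v in u.items():
        if isinstance(v, dict):
            d[k] = recursive_update(d.get(k, {}), v)
        else:
            d[k] = v
    return d


merged = recursive_update(
    {"Stake": {"alice": 10}},
    {"Stake": {"bob": 5}, "Keys": {0: "alice"}},
)
assert merged == {"Stake": {"alice": 10, "bob": 5}, "Keys": {0: "alice"}}
```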
10832    def query(
10833        self,
10834        name: str,
10835        params: list[Any] = [],
10836        module: str = "SubspaceModule",
10837    ) -> Any:
10838        """
10839        Queries a storage function on the network.
10840
10841        Sends a query to the network and retrieves data from a
10842        specified storage function.
10843
10844        Args:
10845            name: The name of the storage function to query.
10846            params: The parameters to pass to the storage function.
10847            module: The module where the storage function is located.
10848
10849        Returns:
10850            The result of the query from the network.
10851
10852        Raises:
10853            NetworkQueryError: If the query fails or is invalid.
10854        """
10855
10856        result = self.query_batch({module: [(name, params)]})
10857
10858        return result[name]
10859
10860    def query_map(
10861        self,
10862        name: str,
10863        params: list[Any] = [],
10864        module: str = "SubspaceModule",
10865        extract_value: bool = True,
10866    ) -> dict[Any, Any]:
10867        """
10868        Queries a storage map from a network node.
10869
10870        Args:
10871            name: The name of the storage map to query.
10872            params: A list of parameters for the query.
10873            module: The module in which the storage map is located.
10874
10875        Returns:
10876            A dictionary representing the key-value pairs
10877              retrieved from the storage map.
10878
10879        Raises:
10880            QueryError: If the query to the network fails or is invalid.
10881        """
10882
10883        result = self.query_batch_map({module: [(name, params)]})
10884
10885        if extract_value:
10886            return {k.value: v.value for k, v in result.items()}  # type: ignore
10887
10888        return result
10889
10890    def compose_call(
10891        self,
10892        fn: str,
10893        params: dict[str, Any],
10894        key: Keypair,
10895        module: str = "SubspaceModule",
10896        wait_for_inclusion: bool = True,
10897        wait_for_finalization: bool | None = None,
10898        sudo: bool = False,
10899    ) -> ExtrinsicReceipt:
10900        """
10901        Composes and submits a call to the network node.
10902
10903        Composes and signs a call with the provided keypair, and submits it to
10904        the network. The call can be a standard extrinsic or a sudo extrinsic if
10905        elevated permissions are required. The method can optionally wait for
10906        the call's inclusion in a block and/or its finalization.
10907
10908        Args:
10909            fn: The function name to call on the network.
10910            params: A dictionary of parameters for the call.
10911            key: The keypair for signing the extrinsic.
10912            module: The module containing the function.
10913            wait_for_inclusion: Wait for the call's inclusion in a block.
10914            wait_for_finalization: Wait for the transaction's finalization.
10915            sudo: Execute the call as a sudo (superuser) operation.
10916
10917        Returns:
10918            The receipt of the submitted extrinsic, if
10919              `wait_for_inclusion` is True. Otherwise, returns a string
10920              identifier of the extrinsic.
10921
10922        Raises:
10923            ChainTransactionError: If the transaction fails.
10924        """
10925
10926        with self.get_conn() as substrate:
10927            if wait_for_finalization is None:
10928                wait_for_finalization = self.wait_for_finalization
10929
10930            call = substrate.compose_call(  # type: ignore
10931                call_module=module, call_function=fn, call_params=params
10932            )
10933            if sudo:
10934                call = substrate.compose_call(  # type: ignore
10935                    call_module="Sudo",
10936                    call_function="sudo",
10937                    call_params={
10938                        "call": call.value,  # type: ignore
10939                    },
10940                )
10941
10942            extrinsic = substrate.create_signed_extrinsic(  # type: ignore
10943                call=call, keypair=key  # type: ignore
10944            )  # type: ignore
10945            response = substrate.submit_extrinsic(
10946                extrinsic=extrinsic,
10947                wait_for_inclusion=wait_for_inclusion,
10948                wait_for_finalization=wait_for_finalization,
10949            )
10950        if wait_for_inclusion:
10951            if not response.is_success:
10952                raise ChainTransactionError(
10953                    response.error_message, response  # type: ignore
10954                )
10955
10956        return response
10957
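When `sudo=True`, `compose_call` above nests the original call inside a `Sudo.sudo` call so the node executes it with superuser privileges. A hedged sketch of that wrapping, using plain dicts as stand-ins for `GenericCall` objects (the dict shape here is illustrative, not the substrate library's internal representation):

```python
def wrap_sudo(call: dict) -> dict:
    # The original call becomes the `call` parameter of a Sudo.sudo call.
    return {
        "call_module": "Sudo",
        "call_function": "sudo",
        "call_params": {"call": call},
    }


inner = {
    "call_module": "SubspaceModule",
    "call_function": "register",
    "call_params": {"name": "my-module"},
}
wrapped = wrap_sudo(inner)
assert wrapped["call_module"] == "Sudo"
assert wrapped["call_params"]["call"]["call_function"] == "register"
```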
10958    def compose_call_multisig(
10959        self,
10960        fn: str,
10961        params: dict[str, Any],
10962        key: Keypair,
10963        signatories: list[Ss58Address],
10964        threshold: int,
10965        module: str = "SubspaceModule",
10966        wait_for_inclusion: bool = True,
10967        wait_for_finalization: bool | None = None,
10968        sudo: bool = False,
10969        era: dict[str, int] | None = None,
10970    ) -> ExtrinsicReceipt:
10971        """
10972        Composes and submits a multisignature call to the network node.
10973
10974        This method allows the composition and submission of a call that
10975        requires multiple signatures for execution, known as a multisignature
10976        call. It supports specifying signatories, a threshold of signatures for
10977        the call's execution, and an optional era for the call's mortality. The
10978        call can be a standard extrinsic, a sudo extrinsic for elevated
10979        permissions, or a multisig extrinsic if multiple signatures are
10980        required. Optionally, the method can wait for the call's inclusion in a
10981        block and/or its finalization. Make sure to pass all keys,
10982        that are part of the multisignature.
10983
10984        Args:
10985            fn: The function name to call on the network.
10986            params: A dictionary of parameters for the call.
10987            key: The keypair for signing the extrinsic.
10988            signatories: List of SS58 addresses of the signatories. Include
10989                ALL KEYS that are part of the multisig.
10990            threshold: The minimum number of signatories required to
10991                execute the extrinsic.
10992            module: The module containing the function to call.
10993            wait_for_inclusion: Whether to wait for the call's inclusion in a block.
10994            wait_for_finalization: Whether to wait for the transaction's finalization.
10995            sudo: Execute the call as a sudo (superuser) operation.
10996            era: Specifies the call's mortality in terms of blocks in the format
10997                {'period': amount_blocks}. If omitted, the extrinsic is immortal.
10998
10999        Returns:
11000            The receipt of the submitted extrinsic if `wait_for_inclusion` is
11001            True. Otherwise, returns a string identifier of the extrinsic.
11002
11003        Raises:
11004            ChainTransactionError: If the transaction fails.
11005        """
11006
11007        # getting the call ready
11008        with self.get_conn() as substrate:
11009            if wait_for_finalization is None:
11010                wait_for_finalization = self.wait_for_finalization
11011
11012            # prepares the `GenericCall` object
11013            call = substrate.compose_call(  # type: ignore
11014                call_module=module, call_function=fn, call_params=params
11015            )
11016            if sudo:
11017                call = substrate.compose_call(  # type: ignore
11018                    call_module="Sudo",
11019                    call_function="sudo",
11020                    call_params={
11021                        "call": call.value,  # type: ignore
11022                    },
11023                )
11024
11025            # modify the rpc methods at runtime to allow for correct payment
11026            # fee calculation; parity has a bug in this version,
11027            # so the method has to be removed
11028            rpc_methods = substrate.config.get("rpc_methods")  # type: ignore
11029
11030            if "state_call" in rpc_methods:  # type: ignore
11031                rpc_methods.remove("state_call")  # type: ignore
11032
11033            # create the multisig account
11034            multisig_acc = substrate.generate_multisig_account(  # type: ignore
11035                signatories, threshold
11036            )
11037
11038            # send the multisig extrinsic
11039            extrinsic = substrate.create_multisig_extrinsic(  # type: ignore
11040                call=call,
11041                keypair=key,
11042                multisig_account=multisig_acc,  # type: ignore
11043                era=era,  # type: ignore
11044            )  # type: ignore
11045
11046            response = substrate.submit_extrinsic(
11047                extrinsic=extrinsic,
11048                wait_for_inclusion=wait_for_inclusion,
11049                wait_for_finalization=wait_for_finalization,
11050            )
11051
11052        if wait_for_inclusion:
11053            if not response.is_success:
11054                raise ChainTransactionError(
11055                    response.error_message, response  # type: ignore
11056                )
11057
11058        return response
11059
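The multisig flow above relies on the on-chain rule that an extrinsic executes only once at least `threshold` of the listed signatories have approved it. A hedged, standalone sketch of that threshold check (this is an illustration of the concept, not the Multisig pallet's actual implementation):

```python
def can_execute(approvals: set[str], signatories: list[str], threshold: int) -> bool:
    # Only approvals from listed signatories count toward the threshold.
    valid = approvals & set(signatories)
    return len(valid) >= threshold


sigs = ["alice", "bob", "carol"]
assert not can_execute({"alice"}, sigs, 2)            # one of two required
assert can_execute({"alice", "carol"}, sigs, 2)       # threshold met
assert not can_execute({"mallory", "eve"}, sigs, 1)   # non-signatories ignored
```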
11060    def transfer(
11061        self,
11062        key: Keypair,
11063        amount: int,
11064        dest: Ss58Address,
11065    ) -> ExtrinsicReceipt:
11066        """
11067        Transfers a specified amount of tokens from the signer's account to the
11068        specified account.
11069
11070        Args:
11071            key: The keypair associated with the sender's account.
11072            amount: The amount to transfer, in nanotokens.
11073            dest: The SS58 address of the recipient.
11074
11075        Returns:
11076            A receipt of the transaction.
11077
11078        Raises:
11079            InsufficientBalanceError: If the sender's account does not have
11080              enough balance.
11081            ChainTransactionError: If the transaction fails.
11082        """
11083
11084        params = {"dest": dest, "value": amount}
11085
11086        return self.compose_call(
11087            module="Balances", fn="transfer", params=params, key=key
11088        )
11089
11090    def transfer_multiple(
11091        self,
11092        key: Keypair,
11093        destinations: list[Ss58Address],
11094        amounts: list[int],
11095        netuid: str | int = 0,
11096    ) -> ExtrinsicReceipt:
11097        """
11098        Transfers specified amounts of tokens from the signer's account to
11099        multiple target accounts.
11100
11101        The `destinations` and `amounts` lists must be of the same length.
11102
11103        Args:
11104            key: The keypair associated with the sender's account.
11105            destinations: A list of SS58 addresses of the recipients.
11106            amounts: Amount to transfer to each recipient, in nanotokens.
11107            netuid: The network identifier.
11108
11109        Returns:
11110            A receipt of the transaction.
11111
11112        Raises:
11113            InsufficientBalanceError: If the sender's account does not have
11114              enough balance for all transfers.
11115            ChainTransactionError: If the transaction fails.
11116        """
11117
11118        assert len(destinations) == len(amounts)
11119
11120        # subtract the existential deposit from each amount
11121        existential_deposit = self.get_existential_deposit()
11122        amounts = [a - existential_deposit for a in amounts]
11123
11124        params = {
11125            "netuid": netuid,
11126            "destinations": destinations,
11127            "amounts": amounts,
11128        }
11129
11130        return self.compose_call(
11131            module="SubspaceModule", fn="transfer_multiple", params=params, key=key
11132        )
11133
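`transfer_multiple` above reduces each requested amount by the chain's existential deposit before composing the call, apparently to keep the transfer accounting above the chain's minimum-balance requirement. A minimal sketch of that adjustment (the deposit value is hypothetical):

```python
def adjust_for_existential_deposit(amounts: list[int], deposit: int) -> list[int]:
    # Each transfer amount is reduced by the existential deposit,
    # mirroring the list comprehension in `transfer_multiple`.
    return [a - deposit for a in amounts]


assert adjust_for_existential_deposit([1000, 500], 100) == [900, 400]
```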
11134    def stake(
11135        self,
11136        key: Keypair,
11137        amount: int,
11138        dest: Ss58Address,
11139        netuid: int = 0,
11140    ) -> ExtrinsicReceipt:
11141        """
11142        Stakes the specified amount of tokens to a module key address.
11143
11144        Args:
11145            key: The keypair associated with the staker's account.
11146            amount: The amount of tokens to stake, in nanotokens.
11147            dest: The SS58 address of the module key to stake to.
11148            netuid: The network identifier.
11149
11150        Returns:
11151            A receipt of the staking transaction.
11152
11153        Raises:
11154            InsufficientBalanceError: If the staker's account does not have
11155              enough balance.
11156            ChainTransactionError: If the transaction fails.
11157        """
11158
11159        params = {"amount": amount, "netuid": netuid, "module_key": dest}
11160
11161        return self.compose_call(fn="add_stake", params=params, key=key)
11162
11163    def unstake(
11164        self,
11165        key: Keypair,
11166        amount: int,
11167        dest: Ss58Address,
11168        netuid: int = 0,
11169    ) -> ExtrinsicReceipt:
11170        """
11171        Unstakes the specified amount of tokens from a module key address.
11172
11173        Args:
11174            key: The keypair associated with the unstaker's account.
11175            amount: The amount of tokens to unstake, in nanotokens.
11176            dest: The SS58 address of the module key to unstake from.
11177            netuid: The network identifier.
11178
11179        Returns:
11180            A receipt of the unstaking transaction.
11181
11182        Raises:
11183            InsufficientStakeError: If the staked key does not have enough
11184              staked tokens by the signer key.
11185            ChainTransactionError: If the transaction fails.
11186        """
11187
11188        params = {"amount": amount, "netuid": netuid, "module_key": dest}
11189        return self.compose_call(fn="remove_stake", params=params, key=key)
11190
11191    def update_module(
11192        self,
11193        key: Keypair,
11194        name: str,
11195        address: str,
11196        metadata: str | None = None,
11197        delegation_fee: int = 20,
11198        netuid: int = 0,
11199    ) -> ExtrinsicReceipt:
11200        """
11201        Updates the parameters of a registered module.
11202
11203        The delegation fee must be an integer between 0 and 100.
11204
11205        Args:
11206            key: The keypair associated with the module's account.
11207            name: The new name for the module.
11208            address: The new address for the module.
11209            metadata: Optional metadata for the module.
11210            delegation_fee: The new delegation fee for the module,
11211                between 0 and 100.
11212            netuid: The network identifier.
11213
11214        Returns:
11215            A receipt of the module update transaction.
11216
11217        Raises:
11218            InvalidParameterError: If the provided parameters are invalid.
11219            ChainTransactionError: If the transaction fails.
11220        """
11221
11222        assert isinstance(delegation_fee, int)
11223
11224        params = {
11225            "netuid": netuid,
11226            "name": name,
11227            "address": address,
11228            "delegation_fee": delegation_fee,
11229            "metadata": metadata,
11230        }
11231
11232        response = self.compose_call("update_module", params=params, key=key)
11233
11234        return response
11235
11236    def register_module(
11237        self,
11238        key: Keypair,
11239        name: str,
11240        address: str | None = None,
11241        subnet: str = "commune",
11242        min_stake: int | None = None,
11243        metadata: str | None = None,
11244    ) -> ExtrinsicReceipt:
11245        """
11246        Registers a new module in the network.
11247
11248        Args:
11249            key: The keypair used for registering the module.
11250            name: The name of the module.
11251            address: The address of the module.
11252            subnet: The network subnet to register the module in.
11253            min_stake: The minimum stake required for the module, in
11254                nanotokens. If None, the network's minimum stake
11255                (`get_min_stake`) is used.
11256            metadata: Optional metadata for the module.
11257
11258        Returns:
11259            A receipt of the registration transaction.
11260
11261        Raises:
11262            InvalidParameterError: If the provided parameters are invalid.
11263            ChainTransactionError: If the transaction fails.
11264        """
11265
11266        stake = self.get_min_stake() if min_stake is None else min_stake
11267
11268        key_addr = key.ss58_address
11269
11270        params = {
11271            "network": subnet,
11272            "address": address,
11273            "name": name,
11274            "stake": stake,
11275            "module_key": key_addr,
11276            "metadata": metadata,
11277        }
11278
11279        response = self.compose_call("register", params=params, key=key)
11280        return response
11281
11282    def vote(
11283        self,
11284        key: Keypair,
11285        uids: list[int],
11286        weights: list[int],
11287        netuid: int = 0,
11288    ) -> ExtrinsicReceipt:
11289        """
11290        Casts votes on a list of module UIDs with corresponding weights.
11291
11292        The length of the UIDs list and the weights list should be the same.
11293        Each weight corresponds to the UID at the same index.
11294
11295        Args:
11296            key: The keypair used for signing the vote transaction.
11297            uids: A list of module UIDs to vote on.
11298            weights: A list of weights corresponding to each UID.
11299            netuid: The network identifier.
11300
11301        Returns:
11302            A receipt of the voting transaction.
11303
11304        Raises:
11305            InvalidParameterError: If the lengths of UIDs and weights lists
11306                do not match.
11307            ChainTransactionError: If the transaction fails.
11308        """
11309
11310        assert len(uids) == len(weights)
11311
11312        params = {
11313            "uids": uids,
11314            "weights": weights,
11315            "netuid": netuid,
11316        }
11317
11318        response = self.compose_call("set_weights", params=params, key=key)
11319
11320        return response
11321
11322    def update_subnet(
11323        self,
11324        key: Keypair,
11325        params: SubnetParams,
11326        netuid: int = 0,
11327    ) -> ExtrinsicReceipt:
11328        """
11329        Update a subnet's configuration.
11330
11331        It requires the founder key for authorization.
11332
11333        Args:
11334            key: The founder keypair of the subnet.
11335            params: The new parameters for the subnet.
11336            netuid: The network identifier.
11337
11338        Returns:
11339            A receipt of the subnet update transaction.
11340
11341        Raises:
11342            AuthorizationError: If the key is not authorized.
11343            ChainTransactionError: If the transaction fails.
11344        """
11345
11346        general_params = dict(params)
11347        general_params["netuid"] = netuid
11348
11349        response = self.compose_call(
11350            fn="update_subnet",
11351            params=general_params,
11352            key=key,
11353        )
11354
11355        return response
11356
11357    def transfer_stake(
11358        self,
11359        key: Keypair,
11360        amount: int,
11361        from_module_key: Ss58Address,
11362        dest_module_address: Ss58Address,
11363        netuid: int = 0,
11364    ) -> ExtrinsicReceipt:
11365        """
11366        Reallocate staked tokens from one staked module to another.
11367
11368        Args:
11369            key: The keypair associated with the account that is delegating the tokens.
11370            amount: The amount of staked tokens to transfer, in nanotokens.
11371            from_module_key: The SS58 address of the module you want to transfer from (currently delegated by the key).
11372            dest_module_address: The SS58 address of the destination (newly delegated key).
11373            netuid: The network identifier.
11374
11375        Returns:
11376            A receipt of the stake transfer transaction.
11377
11378        Raises:
11379            InsufficientStakeError: If the source module key does not have
11380                enough staked tokens.
11381            ChainTransactionError: If the transaction fails.
11382        """
11383
11384        amount = amount - self.get_existential_deposit()
11385
11386        params = {
11387            "amount": amount,
11388            "netuid": netuid,
11389            "module_key": from_module_key,
11390            "new_module_key": dest_module_address,
11391        }
11392
11393        response = self.compose_call("transfer_stake", key=key, params=params)
11394
11395        return response
11396
11397    def multiunstake(
11398        self,
11399        key: Keypair,
11400        keys: list[Ss58Address],
11401        amounts: list[int],
11402        netuid: int = 0,
11403    ) -> ExtrinsicReceipt:
11404        """
11405        Unstakes tokens from multiple module keys.
11406
11407        The lists `keys` and `amounts` must be of the same length. Each
11408        amount corresponds to the module key at the same index.
11409
11410        Args:
11411            key: The keypair associated with the unstaker's account.
11412            keys: A list of SS58 addresses of the module keys to unstake from.
11413            amounts: A list of amounts to unstake from each module key,
11414              in nanotokens.
11415            netuid: The network identifier.
11416
11417        Returns:
11418            A receipt of the multi-unstaking transaction.
11419
11420        Raises:
11421            MismatchedLengthError: If the lengths of keys and amounts lists
11422                do not match.
11423            InsufficientStakeError: If any module key lacks sufficient stake.
11424            ChainTransactionError: If the transaction fails.
11425        """
11426
11427        assert len(keys) == len(amounts), "keys and amounts must have the same length"
11428
11429        params = {"netuid": netuid, "module_keys": keys, "amounts": amounts}
11430
11431        response = self.compose_call(
11432            "remove_stake_multiple", params=params, key=key)
11433
11434        return response
11435
11436    def multistake(
11437        self,
11438        key: Keypair,
11439        keys: list[Ss58Address],
11440        amounts: list[int],
11441        netuid: int = 0,
11442    ) -> ExtrinsicReceipt:
11443        """
11444        Stakes tokens to multiple module keys.
11445
11446        The lengths of the `keys` and `amounts` lists must be the same. Each
11447        amount corresponds to the module key at the same index.
11448
11449        Args:
11450            key: The keypair associated with the staker's account.
11451            keys: A list of SS58 addresses of the module keys to stake to.
11452            amounts: A list of amounts to stake to each module key,
11453                in nanotokens.
11454            netuid: The network identifier.
11455
11456        Returns:
11457            A receipt of the multi-staking transaction.
11458
11459        Raises:
11460            MismatchedLengthError: If the lengths of keys and amounts lists
11461                do not match.
11462            ChainTransactionError: If the transaction fails.
11463        """
11464
11465        assert len(keys) == len(amounts), "keys and amounts must have the same length"
11466
11467        params = {
11468            "module_keys": keys,
11469            "amounts": amounts,
11470            "netuid": netuid,
11471        }
11472
11473        response = self.compose_call(
11474            "add_stake_multiple", params=params, key=key)
11475
11476        return response
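The payload shape shared by `multistake` and `multiunstake`, with the length check made explicit, can be sketched as follows (the SS58 addresses below are placeholders, not real keys):

```python
def build_multi_stake_params(keys: list[str], amounts: list[int],
                             netuid: int = 0) -> dict:
    """Build the params dict shared by add_stake_multiple / remove_stake_multiple."""
    if len(keys) != len(amounts):
        raise ValueError("keys and amounts must have the same length")
    return {"module_keys": keys, "amounts": amounts, "netuid": netuid}

# Placeholder addresses, for illustration only.
params = build_multi_stake_params(["5Module...A", "5Module...B"], [10, 20])
assert params["amounts"] == [10, 20]
```

Raising `ValueError` here stands in for the bare `assert` used by the methods above; an assert is stripped under `python -O`, so an explicit raise is the safer pattern for a length precondition.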
11477
11478    def add_profit_shares(
11479        self,
11480        key: Keypair,
11481        keys: list[Ss58Address],
11482        shares: list[int],
11483    ) -> ExtrinsicReceipt:
11484        """
11485        Allocates profit shares to multiple keys.
11486
11487        The lists `keys` and `shares` must be of the same length,
11488        with each share amount corresponding to the key at the same index.
11489
11490        Args:
11491            key: The keypair associated with the account
11492                distributing the shares.
11493            keys: A list of SS58 addresses to allocate shares to.
11494            shares: A list of share amounts to allocate to each key,
11495                one value per address.
11496
11497        Returns:
11498            A receipt of the profit sharing transaction.
11499
11500        Raises:
11501            MismatchedLengthError: If the lengths of keys and shares
11502                lists do not match.
11503            ChainTransactionError: If the transaction fails.
11504        """
11505
11506        assert len(keys) == len(shares), "keys and shares must have the same length"
11507
11508        params = {"keys": keys, "shares": shares}
11509
11510        response = self.compose_call(
11511            "add_profit_shares", params=params, key=key)
11512
11513        return response
11514
11515    def add_subnet_proposal(
11516        self, key: Keypair, params: SubnetParams, netuid: int = 0
11517    ) -> ExtrinsicReceipt:
11518        """
11519        Submits a proposal for creating or modifying a subnet within the
11520        network.
11521
11522        The proposal includes various parameters like the name, founder, share
11523        allocations, and other subnet-specific settings.
11524
11525        Args:
11526            key: The keypair used for signing the proposal transaction.
11527            params: The parameters for the subnet proposal.
11528            netuid: The network identifier.
11529
11530        Returns:
11531            A receipt of the subnet proposal transaction.
11532
11533        Raises:
11534            InvalidParameterError: If the provided subnet
11535                parameters are invalid.
11536            ChainTransactionError: If the transaction fails.
11537        """
11538
11539        general_params = dict(params)
11540        general_params["netuid"] = netuid
11541
11542        response = self.compose_call(
11543            fn="add_subnet_proposal",
11544            params=general_params,
11545            key=key,
11546        )
11547
11548        return response
11549
11550    def add_custom_proposal(
11551        self,
11552        key: Keypair,
11553        cid: str,
11554    ) -> ExtrinsicReceipt:
11555        """Submits a custom proposal whose content is referenced by the given CID."""
11556        params = {"data": cid}
11557
11558        response = self.compose_call(
11559            fn="add_custom_proposal", params=params, key=key)
11560        return response
11561
11562    def add_custom_subnet_proposal(
11563        self,
11564        key: Keypair,
11565        cid: str,
11566        netuid: int = 0,
11567    ) -> ExtrinsicReceipt:
11568        """
11569        Submits a proposal for creating or modifying a custom subnet within the
11570        network.
11571
11572        The proposal body is referenced by a content identifier (CID)
11573        rather than by inline subnet parameters.
11574
11575        Args:
11576            key: The keypair used for signing the proposal transaction.
11577            cid: The content identifier of the proposal data.
11578            netuid: The network identifier.
11579
11580        Returns:
11581            A receipt of the subnet proposal transaction.
11582        """
11583
11584        params = {
11585            "data": cid,
11586            "netuid": netuid,
11587        }
11588
11589        response = self.compose_call(
11590            fn="add_custom_subnet_proposal",
11591            params=params,
11592            key=key,
11593        )
11594
11595        return response
11596
11597    def add_global_proposal(
11598        self,
11599        key: Keypair,
11600        params: NetworkParams,
11601    ) -> ExtrinsicReceipt:
11602        """
11603        Submits a proposal for altering the global network parameters.
11604
11605        Allows for the submission of a proposal to change various global
11606        parameters of the network, such as emission rates, rate limits,
11607        and voting thresholds.
11608        It is used to suggest changes that affect the operation of the
11609        entire network.
11610
11611        Args:
11612            key: The keypair used for signing the proposal transaction.
11613            params: A dictionary containing global network parameters
11614                    like maximum allowed subnets, modules,
11615                    transaction rate limits, and others.
11616
11617        Returns:
11618            A receipt of the global proposal transaction.
11619
11620        Raises:
11621            InvalidParameterError: If the provided network
11622                parameters are invalid.
11623            ChainTransactionError: If the transaction fails.
11624        """
11625
11626        general_params = vars(params)
11627        response = self.compose_call(
11628            fn="add_global_proposal",
11629            params=general_params,
11630            key=key,
11631        )
11632
11633        return response
11634
11635    def vote_on_proposal(
11636        self,
11637        key: Keypair,
11638        proposal_id: int,
11639        agree: bool,
11640    ) -> ExtrinsicReceipt:
11641        """
11642        Casts a vote on a specified proposal within the network.
11643
11644        Args:
11645            key: The keypair used for signing the vote transaction.
11646            proposal_id: The unique identifier of the proposal to vote on.
11647            agree: True to vote in favor of the proposal, False against.
11648        Returns:
11649            A receipt of the voting transaction.
11650
11651        Raises:
11652            InvalidProposalIDError: If the provided proposal ID does not
11653                exist or is invalid.
11654            ChainTransactionError: If the transaction fails.
11655        """
11656
11657        params = {"proposal_id": proposal_id, "agree": agree}
11658
11659        response = self.compose_call("vote_proposal", key=key, params=params)
11660
11661        return response
11662
11663    def unvote_on_proposal(
11664        self,
11665        key: Keypair,
11666        proposal_id: int,
11667    ) -> ExtrinsicReceipt:
11668        """
11669        Retracts a previously cast vote on a specified proposal.
11670
11671        Args:
11672            key: The keypair used for signing the unvote transaction.
11673            proposal_id: The unique identifier of the proposal to withdraw the
11674                vote from.
11675
11676        Returns:
11677            A receipt of the unvoting transaction.
11678
11679        Raises:
11680            InvalidProposalIDError: If the provided proposal ID does not
11681                exist or is invalid.
11682            ChainTransactionError: If the transaction fails to be processed, or
11683                if there was no prior vote to retract.
11684        """
11685
11686        params = {"proposal_id": proposal_id}
11687
11688        response = self.compose_call("unvote_proposal", key=key, params=params)
11689
11690        return response
11691
11692    def add_dao_application(
11693        self, key: Keypair, application_key: Ss58Address, data: str
11694    ) -> ExtrinsicReceipt:
11695        """
11696        Submits a new application to the general subnet DAO.
11697
11698        Args:
11699            key: The keypair used for signing the application transaction.
11700            application_key: The SS58 address of the application key.
11701            data: The data associated with the application.
11702
11703        Returns:
11704            A receipt of the application transaction.
11705
11706        Raises:
11707            ChainTransactionError: If the transaction fails.
11708        """
11709
11710        params = {"application_key": application_key, "data": data}
11711
11712        response = self.compose_call(
11713            "add_dao_application", key=key, params=params)
11714
11715        return response
11716
11717    def query_map_curator_applications(self) -> dict[str, dict[str, str]]:
11718        query_result = self.query_map(
11719            "CuratorApplications", params=[], extract_value=False)
11720        applications = query_result.get("CuratorApplications", {})
11721        return applications
11722
11723    def query_map_proposals(
11724        self, extract_value: bool = False
11725    ) -> dict[int, dict[str, Any]]:
11726        """
11727        Retrieves a mapping of proposals from the network.
11728
11729        Queries the network and returns a mapping of proposal IDs to
11730        their respective parameters.
11731
11732        Returns:
11733            A dictionary mapping proposal IDs
11734            to dictionaries of their parameters.
11735
11736        Raises:
11737            QueryError: If the query to the network fails or is invalid.
11738        """
11739
11740        return self.query_map("Proposals", extract_value=extract_value)["Proposals"]
11741
11742    def query_map_weights(
11743        self, netuid: int = 0, extract_value: bool = False
11744    ) -> dict[int, list[int]]:
11745        """
11746        Retrieves a mapping of weights for keys on the network.
11747
11748        Queries the network and returns a mapping of key UIDs to
11749        their respective weights.
11750
11751        Args:
11752            netuid: The network UID from which to get the weights.
11753
11754        Returns:
11755            A dictionary mapping key UIDs to lists of their weights.
11756
11757        Raises:
11758            QueryError: If the query to the network fails or is invalid.
11759        """
11760
11761        return self.query_map("Weights", [netuid], extract_value=extract_value)[
11762            "Weights"
11763        ]
11764
11765    def query_map_key(
11766        self,
11767        netuid: int = 0,
11768        extract_value: bool = False,
11769    ) -> dict[int, Ss58Address]:
11770        """
11771        Retrieves a map of keys from the network.
11772
11773        Fetches a mapping of key UIDs to their associated
11774        addresses on the network.
11775        The query can be targeted at a specific network UID if required.
11776
11777        Args:
11778            netuid: The network UID from which to get the keys.
11779
11780        Returns:
11781            A dictionary mapping key UIDs to their addresses.
11782
11783        Raises:
11784            QueryError: If the query to the network fails or is invalid.
11785        """
11786        return self.query_map("Keys", [netuid], extract_value=extract_value)["Keys"]
11787
11788    def query_map_address(
11789        self, netuid: int = 0, extract_value: bool = False
11790    ) -> dict[int, str]:
11791        """
11792        Retrieves a map of key addresses from the network.
11793
11794        Queries the network for a mapping of key UIDs to their addresses.
11795
11796        Args:
11797            netuid: The network UID from which to get the addresses.
11798
11799        Returns:
11800            A dictionary mapping key UIDs to their addresses.
11801
11802        Raises:
11803            QueryError: If the query to the network fails or is invalid.
11804        """
11805
11806        return self.query_map("Address", [netuid], extract_value=extract_value)[
11807            "Address"
11808        ]
11809
11810    def query_map_emission(self, extract_value: bool = False) -> dict[int, list[int]]:
11811        """
11812        Retrieves a map of emissions for keys on the network.
11813
11814        Queries the network to get a mapping of
11815        key UIDs to their emission values.
11816
11817        Returns:
11818            A dictionary mapping key UIDs to lists of their emission values.
11819
11820        Raises:
11821            QueryError: If the query to the network fails or is invalid.
11822        """
11823
11824        return self.query_map("Emission", extract_value=extract_value)["Emission"]
11825
11826    def query_map_incentive(self, extract_value: bool = False) -> dict[int, list[int]]:
11827        """
11828        Retrieves a mapping of incentives for keys on the network.
11829
11830        Queries the network and returns a mapping of key UIDs to
11831        their respective incentive values.
11832
11833        Returns:
11834            A dictionary mapping key UIDs to lists of their incentive values.
11835
11836        Raises:
11837            QueryError: If the query to the network fails or is invalid.
11838        """
11839
11840        return self.query_map("Incentive", extract_value=extract_value)["Incentive"]
11841
11842    def query_map_dividend(self, extract_value: bool = False) -> dict[int, list[int]]:
11843        """
11844        Retrieves a mapping of dividends for keys on the network.
11845
11846        Queries the network for a mapping of key UIDs to
11847        their dividend values.
11848
11849        Returns:
11850            A dictionary mapping key UIDs to lists of their dividend values.
11851
11852        Raises:
11853            QueryError: If the query to the network fails or is invalid.
11854        """
11855
11856        return self.query_map("Dividends", extract_value=extract_value)["Dividends"]
11857
11858    def query_map_regblock(
11859        self, netuid: int = 0, extract_value: bool = False
11860    ) -> dict[int, int]:
11861        """
11862        Retrieves a mapping of registration blocks for keys on the network.
11863
11864        Queries the network for a mapping of key UIDs to
11865        the blocks where they were registered.
11866
11867        Args:
11868            netuid: The network UID from which to get the registration blocks.
11869
11870        Returns:
11871            A dictionary mapping key UIDs to their registration blocks.
11872
11873        Raises:
11874            QueryError: If the query to the network fails or is invalid.
11875        """
11876
11877        return self.query_map(
11878            "RegistrationBlock", [netuid], extract_value=extract_value
11879        )["RegistrationBlock"]
11880
11881    def query_map_lastupdate(self, extract_value: bool = False) -> dict[int, list[int]]:
11882        """
11883        Retrieves a mapping of the last update times for keys on the network.
11884
11885        Queries the network for a mapping of key UIDs to their last update times.
11886
11887        Returns:
11888            A dictionary mapping key UIDs to lists of their last update times.
11889
11890        Raises:
11891            QueryError: If the query to the network fails or is invalid.
11892        """
11893
11894        return self.query_map("LastUpdate", extract_value=extract_value)["LastUpdate"]
11895
11896    def query_map_total_stake(self, extract_value: bool = False) -> dict[int, int]:
11897        """
11898        Retrieves a mapping of total stakes for keys on the network.
11899
11900        Queries the network for a mapping of key UIDs to their total stake amounts.
11901
11902        Returns:
11903            A dictionary mapping key UIDs to their total stake amounts.
11904
11905        Raises:
11906            QueryError: If the query to the network fails or is invalid.
11907        """
11908
11909        return self.query_map("TotalStake", extract_value=extract_value)["TotalStake"]
11910
11911    def query_map_stakefrom(
11912        self, netuid: int = 0, extract_value: bool = False
11913    ) -> dict[str, list[tuple[str, int]]]:
11914        """
11915        Retrieves a mapping of stakes from various sources for keys on the network.
11916
11917        Queries the network to obtain a mapping of key addresses to the sources
11918        and amounts of stakes they have received.
11919
11920        Args:
11921            netuid: The network UID from which to get the stakes.
11922
11923        Returns:
11924            A dictionary mapping key addresses to lists of tuples
11925            (module_key_address, amount).
11926
11927        Raises:
11928            QueryError: If the query to the network fails or is invalid.
11929        """
11930
11931        return self.query_map("StakeFrom", [netuid], extract_value=extract_value)[
11932            "StakeFrom"
11933        ]
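The `(module_key_address, amount)` tuples documented above can be aggregated per module key. A sketch over made-up data that mirrors the documented return shape of `query_map_stakefrom` (addresses and amounts are fabricated for illustration):

```python
# Made-up data in the documented shape of query_map_stakefrom():
# module key address -> list of (staker address, amount) tuples.
stake_from: dict[str, list[tuple[str, int]]] = {
    "5ModuleKey...X": [("5Staker...A", 300), ("5Staker...B", 200)],
    "5ModuleKey...Y": [("5Staker...A", 50)],
}

# Total stake received by each module key, in nanotokens.
totals = {module: sum(amount for _, amount in stakers)
          for module, stakers in stake_from.items()}
assert totals["5ModuleKey...X"] == 500
```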
11934
11935    def query_map_staketo(
11936        self, netuid: int = 0, extract_value: bool = False
11937    ) -> dict[str, list[tuple[str, int]]]:
11938        """
11939        Retrieves a mapping of stakes to destinations for keys on the network.
11940
11941        Queries the network for a mapping of key addresses to the destinations
11942        and amounts of stakes they have made.
11943
11944        Args:
11945            netuid: The network UID from which to get the stakes.
11946
11947        Returns:
11948            A dictionary mapping key addresses to lists of tuples
11949            (module_key_address, amount).
11950
11951        Raises:
11952            QueryError: If the query to the network fails or is invalid.
11953        """
11954
11955        return self.query_map("StakeTo", [netuid], extract_value=extract_value)[
11956            "StakeTo"
11957        ]
11958
11959    def query_map_stake(
11960        self, netuid: int = 0, extract_value: bool = False
11961    ) -> dict[str, int]:
11962        """
11963        Retrieves a mapping of stakes for keys on the network.
11964
11965        Queries the network and returns a mapping of key addresses to their
11966        respective delegated staked balances amounts.
11967        The query can be targeted at a specific network UID if required.
11968
11969        Args:
11970            netuid: The network UID from which to get the stakes.
11971
11972        Returns:
11973            A dictionary mapping key addresses to their stake amounts.
11974
11975        Raises:
11976            QueryError: If the query to the network fails or is invalid.
11977        """
11978
11979        return self.query_map("Stake", [netuid], extract_value=extract_value)["Stake"]
11980
11981    def query_map_delegationfee(
11982        self, netuid: int = 0, extract_value: bool = False
11983    ) -> dict[str, int]:
11984        """
11985        Retrieves a mapping of delegation fees for keys on the network.
11986
11987        Queries the network to obtain a mapping of key addresses to their
11988        respective delegation fees.
11989
11990        Args:
11991            netuid: The network UID to filter the delegation fees.
11992
11993        Returns:
11994            A dictionary mapping key addresses to their delegation fees.
11995
11996        Raises:
11997            QueryError: If the query to the network fails or is invalid.
11998        """
11999
12000        return self.query_map("DelegationFee", [netuid], extract_value=extract_value)[
12001            "DelegationFee"
12002        ]
12003
12004    def query_map_tempo(self, extract_value: bool = False) -> dict[int, int]:
12005        """
12006        Retrieves a mapping of tempo settings for the network.
12007
12008        Queries the network to obtain the tempo (rate of reward distributions)
12009        settings for various network subnets.
12010
12011        Returns:
12012            A dictionary mapping network UIDs to their tempo settings.
12013
12014        Raises:
12015            QueryError: If the query to the network fails or is invalid.
12016        """
12017
12018        return self.query_map("Tempo", extract_value=extract_value)["Tempo"]
12019
12020    def query_map_immunity_period(self, extract_value: bool = False) -> dict[int, int]:
12021        """
12022        Retrieves a mapping of immunity periods for the network.
12023
12024        Queries the network for the immunity period settings,
12025        which represent the time duration during which modules
12026        cannot be deregistered.
12027
12028        Returns:
12029            A dictionary mapping network UIDs to their immunity period settings.
12030
12031        Raises:
12032            QueryError: If the query to the network fails or is invalid.
12033        """
12034
12035        return self.query_map("ImmunityPeriod", extract_value=extract_value)[
12036            "ImmunityPeriod"
12037        ]
12038
12039    def query_map_min_allowed_weights(
12040        self, extract_value: bool = False
12041    ) -> dict[int, int]:
12042        """
12043        Retrieves a mapping of minimum allowed weights for the network.
12044
12045        Queries the network to obtain the minimum allowed weights,
12046        which are the lowest permissible weight values that can be set by
12047        validators.
12048
12049        Returns:
12050            A dictionary mapping network UIDs to
12051            their minimum allowed weight values.
12052
12053        Raises:
12054            QueryError: If the query to the network fails or is invalid.
12055        """
12056
12057        return self.query_map("MinAllowedWeights", extract_value=extract_value)[
12058            "MinAllowedWeights"
12059        ]
12060
12061    def query_map_max_allowed_weights(
12062        self, extract_value: bool = False
12063    ) -> dict[int, int]:
12064        """
12065        Retrieves a mapping of maximum allowed weights for the network.
12066
12067        Queries the network for the maximum allowed weights,
12068        which are the highest permissible
12069        weight values that can be set by validators.
12070
12071        Returns:
12072            A dictionary mapping network UIDs to
12073            their maximum allowed weight values.
12074
12075        Raises:
12076            QueryError: If the query to the network fails or is invalid.
12077        """
12078
12079        return self.query_map("MaxAllowedWeights", extract_value=extract_value)[
12080            "MaxAllowedWeights"
12081        ]
12082
12083    def query_map_max_allowed_uids(self, extract_value: bool = False) -> dict[int, int]:
12084        """
12085        Queries the network for the maximum number of allowed UIDs
12086        for each network subnet.
12087
12088        Fetches a mapping of network subnets to their respective
12089        limits on the number of UIDs that can be registered.
12090
12091        Returns:
12092            A dictionary mapping network UIDs (unique identifiers) to their
12093            maximum allowed number of UIDs.
12094            Each entry represents a network subnet
12095            with its corresponding UID limit.
12096
12097        Raises:
12098            QueryError: If the query to the network fails or is invalid.
12099        """
12100
12101        return self.query_map("MaxAllowedUids", extract_value=extract_value)[
12102            "MaxAllowedUids"
12103        ]
12104
12105    def query_map_min_stake(self, extract_value: bool = False) -> dict[int, int]:
12106        """
12107        Retrieves a mapping of minimum allowed stake on the network.
12108
12109        Queries the network to obtain the minimum stake amount,
12110        expressed in nanotokens.
12111
12112        Returns:
12113            A dictionary mapping network UIDs to
12114            their minimum allowed stake values.
12115
12116        Raises:
12117            QueryError: If the query to the network fails or is invalid.
12118        """
12119
12120        return self.query_map("MinStake", extract_value=extract_value)["MinStake"]
12121
12122    def query_map_max_stake(self, extract_value: bool = False) -> dict[int, int]:
12123        """
12124        Retrieves a mapping of the maximum stake values for the network.
12125
12126        Queries the network for the maximum stake values across various
12127        subnets of the network.
12128
12129        Returns:
12130            A dictionary mapping network UIDs to their maximum stake values.
12131
12132        Raises:
12133            QueryError: If the query to the network fails or is invalid.
12134        """
12135
12136        return self.query_map("MaxStake", extract_value=extract_value)["MaxStake"]
12137
12138    def query_map_founder(self, extract_value: bool = False) -> dict[int, str]:
12139        """
12140        Retrieves a mapping of founders for the network.
12141
12142        Queries the network to obtain the founders associated with
12143        various subnets.
12144
12145        Returns:
12146            A dictionary mapping network UIDs to their respective founders.
12147
12148        Raises:
12149            QueryError: If the query to the network fails or is invalid.
12150        """
12151
12152        return self.query_map("Founder", extract_value=extract_value)["Founder"]
12153
12154    def query_map_founder_share(self, extract_value: bool = False) -> dict[int, int]:
12155        """
12156        Retrieves a mapping of founder shares for the network.
12157
12158        Queries the network for the share percentages
12159        allocated to founders across different subnets.
12160
12161        Returns:
12162            A dictionary mapping network UIDs to their founder share percentages.
12163
12164        Raises:
12165            QueryError: If the query to the network fails or is invalid.
12166        """
12167
12168        return self.query_map("FounderShare", extract_value=extract_value)[
12169            "FounderShare"
12170        ]
12171
12172    def query_map_incentive_ratio(self, extract_value: bool = False) -> dict[int, int]:
12173        """
12174        Retrieves a mapping of incentive ratios for the network.
12175
12176        Queries the network for the incentive ratios,
12177        which are the proportions of rewards or incentives
12178        allocated in different subnets of the network.
12179
12180        Returns:
12181            A dictionary mapping network UIDs to their incentive ratios.
12182
12183        Raises:
12184            QueryError: If the query to the network fails or is invalid.
12185        """
12186
12187        return self.query_map("IncentiveRatio", extract_value=extract_value)[
12188            "IncentiveRatio"
12189        ]
12190
12191    def query_map_trust_ratio(self, extract_value: bool = False) -> dict[int, int]:
12192        """
12193        Retrieves a mapping of trust ratios for the network.
12194
12195        Queries the network for trust ratios,
12196        indicative of the level of trust or credibility assigned
12197        to different subnets of the network.
12198
12199        Returns:
12200            A dictionary mapping network UIDs to their trust ratios.
12201
12202        Raises:
12203            QueryError: If the query to the network fails or is invalid.
12204        """
12205
12206        return self.query_map("TrustRatio", extract_value=extract_value)["TrustRatio"]
12207
12208    def query_map_vote_mode_subnet(self, extract_value: bool = False) -> dict[int, str]:
12209        """
12210        Retrieves a mapping of vote modes for subnets within the network.
12211
12212        Queries the network for the voting modes used in different
12213        subnets, which define the methodology or approach of voting within those
12214        subnets.
12215
12216        Returns:
12217            A dictionary mapping network UIDs to their subnet
12218            vote modes.
12219
12220        Raises:
12221            QueryError: If the query to the network fails or is invalid.
12222        """
12223
12224        return self.query_map("VoteModeSubnet", extract_value=extract_value)[
12225            "VoteModeSubnet"
12226        ]
12227
12228    def query_map_legit_whitelist(
12229        self, extract_value: bool = False
12230    ) -> dict[Ss58Address, int]:
12231        """
12232        Retrieves a mapping of whitelisted addresses for the network.
12233
12234        Queries the network for a mapping of whitelisted addresses
12235        and their respective legitimacy status.
12236
12237        Returns:
12238            A dictionary mapping addresses to their legitimacy status.
12239
12240        Raises:
12241            QueryError: If the query to the network fails or is invalid.
12242        """
12243
12244        return self.query_map("LegitWhitelist", extract_value=extract_value)[
12245            "LegitWhitelist"
12246        ]
12247
12248    def query_map_subnet_names(self, extract_value: bool = False) -> dict[int, str]:
12249        """
12250        Retrieves a mapping of subnet names within the network.
12251
12252        Queries the network for the names of various subnets,
12253        providing an overview of the different
12254        subnets within the network.
12255
12256        Returns:
12257            A dictionary mapping network UIDs to their subnet names.
12258
12259        Raises:
12260            QueryError: If the query to the network fails or is invalid.
12261        """
12262
12263        return self.query_map("SubnetNames", extract_value=extract_value)["SubnetNames"]
12264
12265    def query_map_balances(
12266        self, extract_value: bool = False
12267    ) -> dict[str, dict["str", int | dict[str, int]]]:
12268        """
12269        Retrieves a mapping of account balances within the network.
12270
12271        Queries the network for the balances associated with different accounts.
12272        It provides detailed information including various types of
12273        balances for each account.
12274
12275        Returns:
12276            A dictionary mapping account addresses to their balance details.
12277
12278        Raises:
12279            QueryError: If the query to the network fails or is invalid.
12280        """
12281
12282        return self.query_map("Account", module="System", extract_value=extract_value)[
12283            "Account"
12284        ]
12285
12286    def query_map_registration_blocks(
12287        self, netuid: int = 0, extract_value: bool = False
12288    ) -> dict[int, int]:
12289        """
12290        Retrieves a mapping of registration blocks for UIDs on the network.
12291
12292        Queries the network to find the block numbers at which various
12293        UIDs were registered.
12294
12295        Args:
12296            netuid: The network UID from which to get the registrations.
12297
12298        Returns:
12299            A dictionary mapping UIDs to their registration block numbers.
12300
12301        Raises:
12302            QueryError: If the query to the network fails or is invalid.
12303        """
12304
12305        return self.query_map(
12306            "RegistrationBlock", [netuid], extract_value=extract_value
12307        )["RegistrationBlock"]
12308
12309    def query_map_name(
12310        self, netuid: int = 0, extract_value: bool = False
12311    ) -> dict[int, str]:
12312        """
12313        Retrieves a mapping of names for keys on the network.
12314
12315        Queries the network for the names associated with different keys.
12316        It provides a mapping of key UIDs to their registered names.
12317
12318        Args:
12319            netuid: The network UID from which to get the names.
12320
12321        Returns:
12322            A dictionary mapping key UIDs to their names.
12323
12324        Raises:
12325            QueryError: If the query to the network fails or is invalid.
12326        """
12327
12328        return self.query_map("Name", [netuid], extract_value=extract_value)["Name"]
12329
12330    #  == QUERY FUNCTIONS == #
12331
12332    def get_immunity_period(self, netuid: int = 0) -> int:
12333        """
12334        Queries the network for the immunity period setting.
12335
12336        The immunity period is a time duration during which a module
12337        cannot be deregistered from the network.
12338        Fetches the immunity period for a specified network subnet.
12339
12340        Args:
12341            netuid: The network UID for which to query the immunity period.
12342
12343        Returns:
12344            The immunity period setting for the specified network subnet.
12345
12346        Raises:
12347            QueryError: If the query to the network fails or is invalid.
12348        """
12349
12350        return self.query(
12351            "ImmunityPeriod",
12352            params=[netuid],
12353        )
12354
12355    def get_max_set_weights_per_epoch(self):
12356        return self.query("MaximumSetWeightCallsPerEpoch")
12357
12358    def get_min_allowed_weights(self, netuid: int = 0) -> int:
12359        """
12360        Queries the network for the minimum allowed weights setting.
12361
12362        Retrieves the minimum number of weights that a validator
12363        is allowed to set within a specific network subnet.
12364
12365        Args:
12366            netuid: The network UID for which to query the minimum allowed
12367              weights.
12368
12369        Returns:
12370            The minimum allowed number of weights for the specified network
12371              subnet.
12372
12373        Raises:
12374            QueryError: If the query to the network fails or is invalid.
12375        """
12376
12377        return self.query(
12378            "MinAllowedWeights",
12379            params=[netuid],
12380        )
12381
12382    def get_max_allowed_weights(self, netuid: int = 0) -> int:
12383        """
12384        Queries the network for the maximum allowed weights setting.
12385
12386        Retrieves the maximum number of weights that a validator
12387        is allowed to set within a specific network subnet.
12388
12389        Args:
12390            netuid: The network UID for which to query the maximum allowed
12391              weights.
12392
12393        Returns:
12394            The maximum allowed number of weights for the specified network
12395              subnet.
12396
12397        Raises:
12398            QueryError: If the query to the network fails or is invalid.
12399        """
12400
12401        return self.query("MaxAllowedWeights", params=[netuid])
12402
12403    def get_max_allowed_uids(self, netuid: int = 0) -> int:
12404        """
12405        Queries the network for the maximum allowed UIDs setting.
12406
12407        Fetches the upper limit on the number of module UIDs that can
12408        be allocated or used within a specific network subnet.
12409
12410        Args:
12411            netuid: The network UID for which to query the maximum allowed UIDs.
12412
12413        Returns:
12414            The maximum number of allowed UIDs for the specified network subnet.
12415
12416        Raises:
12417            QueryError: If the query to the network fails or is invalid.
12418        """
12419
12420        return self.query("MaxAllowedUids", params=[netuid])
12421
12422    def get_name(self, netuid: int = 0) -> str:
12423        """
12424        Queries the network for the name of a specific subnet.
12425
12426        Args:
12427            netuid: The network UID for which to query the name.
12428
12429        Returns:
12430            The name of the specified network subnet.
12431
12432        Raises:
12433            QueryError: If the query to the network fails or is invalid.
12434        """
12435
12436        return self.query("Name", params=[netuid])
12437
12438    def get_subnet_name(self, netuid: int = 0) -> str:
12439        """
12440        Queries the network for the name of a specific subnet.
12441
12442        Args:
12443            netuid: The network UID for which to query the name.
12444
12445        Returns:
12446            The name of the specified network subnet.
12447
12448        Raises:
12449            QueryError: If the query to the network fails or is invalid.
12450        """
12451
12452        return self.query("SubnetNames", params=[netuid])
12453
12454    def get_global_dao_treasury(self):
12455        return self.query("GlobalDaoTreasury")
12456
12457    def get_n(self, netuid: int = 0) -> int:
12458        """
12459        Queries the network for the 'N' hyperparameter, which represents how
12460        many modules are on the network.
12461
12462        Args:
12463            netuid: The network UID for which to query the 'N' hyperparameter.
12464
12465        Returns:
12466            The value of the 'N' hyperparameter for the specified network
12467              subnet.
12468
12469        Raises:
12470            QueryError: If the query to the network fails or is invalid.
12471        """
12472
12473        return self.query("N", params=[netuid])
12474
12475    def get_tempo(self, netuid: int = 0) -> int:
12476        """
12477        Queries the network for the tempo setting, measured in blocks, for the
12478        specified subnet.
12479
12480        Args:
12481            netuid: The network UID for which to query the tempo.
12482
12483        Returns:
12484            The tempo setting for the specified subnet.
12485
12486        Raises:
12487            QueryError: If the query to the network fails or is invalid.
12488        """
12489
12490        return self.query("Tempo", params=[netuid])
12491
12492    def get_total_stake(self, netuid: int = 0) -> int:
12493        """
12494        Queries the network for the total stake amount.
12495
12496        Retrieves the total amount of stake within a specific network subnet.
12497
12498        Args:
12499            netuid: The network UID for which to query the total stake.
12500
12501        Returns:
12502            The total stake amount for the specified network subnet.
12503
12504        Raises:
12505            QueryError: If the query to the network fails or is invalid.
12506        """
12507
12508        return self.query(
12509            "TotalStake",
12510            params=[netuid],
12511        )
12512
12513    def get_registrations_per_block(self) -> int:
12514        """
12515        Queries the network for the number of registrations per block.
12516
12517        Fetches the number of registrations that are processed per
12518        block within the network.
12519
12520        Returns:
12521            The number of registrations processed per block.
12522
12523        Raises:
12524            QueryError: If the query to the network fails or is invalid.
12525        """
12526
12527        return self.query(
12528            "RegistrationsPerBlock",
12529        )
12530
12531    def max_registrations_per_block(self, netuid: int = 0) -> int:
12532        """
12533        Queries the network for the maximum number of registrations per block.
12534
12535        Retrieves the upper limit of registrations that can be processed in
12536        each block within a specific network subnet.
12537
12538        Args:
12539            netuid: The network UID for which to query.
12540
12541        Returns:
12542            The maximum number of registrations per block for
12543            the specified network subnet.
12544
12545        Raises:
12546            QueryError: If the query to the network fails or is invalid.
12547        """
12548
12549        return self.query(
12550            "MaxRegistrationsPerBlock",
12551            params=[netuid],
12552        )
12553
12554    def get_proposal(self, proposal_id: int = 0):
12555        """
12556        Queries the network for a specific proposal.
12557
12558        Args:
12559            proposal_id: The ID of the proposal to query.
12560
12561        Returns:
12562            The details of the specified proposal.
12563
12564        Raises:
12565            QueryError: If the query to the network fails, is invalid,
12566                or if the proposal ID does not exist.
12567        """
12568
12569        return self.query(
12570            "Proposals",
12571            params=[proposal_id],
12572        )
12573
12574    def get_trust(self, netuid: int = 0):
12575        """
12576        Queries the network for the trust setting of a specific network subnet.
12577
12578        Retrieves the trust scores assigned to modules within a
12579        particular network subnet.
12581
12582        Args:
12583            netuid: The network UID for which to query the trust setting.
12584
12585        Returns:
12586            The trust level or score for the specified network subnet.
12587
12588        Raises:
12589            QueryError: If the query to the network fails or is invalid.
12590        """
12591
12592        return self.query(
12593            "Trust",
12594            params=[netuid],
12595        )
12596
12597    def get_uids(self, key: Ss58Address, netuid: int = 0) -> int | None:
12598        """
12599        Queries the network for module UIDs associated with a specific key.
12600
12601        Args:
12602            key: The key address for which to query UIDs.
12603            netuid: The network UID within which to search for the key.
12604
12605        Returns:
12606            The UID associated with the specified key, or None if the key is not registered.
12607
12608        Raises:
12609            QueryError: If the query to the network fails or is invalid.
12610        """
12611
12612        return self.query(
12613            "Uids",
12614            params=[netuid, key],
12615        )
12616
12617    def get_unit_emission(self) -> int:
12618        """
12619        Queries the network for the unit emission setting.
12620
12621        Retrieves the unit emission value, which represents the
12622        emission rate or quantity for the $COMAI token.
12623
12624        Returns:
12625            The unit emission value in nanos for the network.
12626
12627        Raises:
12628            QueryError: If the query to the network fails or is invalid.
12629        """
12630
12631        return self.query("UnitEmission")
12632
12633    def get_tx_rate_limit(self) -> int:
12634        """
12635        Queries the network for the transaction rate limit.
12636
12637        Retrieves the rate limit for transactions within the network,
12638        which defines the maximum number of transactions that can be
12639        processed within a certain timeframe.
12640
12641        Returns:
12642            The transaction rate limit for the network.
12643
12644        Raises:
12645            QueryError: If the query to the network fails or is invalid.
12646        """
12647
12648        return self.query(
12649            "TxRateLimit",
12650        )
12651
12652    def get_burn_rate(self) -> int:
12653        """
12654        Queries the network for the burn rate setting.
12655
12656        Retrieves the burn rate, which represents the rate at
12657        which the $COMAI token is permanently
12658        removed or 'burned' from circulation.
12659
12660        Returns:
12661            The burn rate for the network.
12662
12663        Raises:
12664            QueryError: If the query to the network fails or is invalid.
12665        """
12666
12667        return self.query(
12668            "BurnRate",
12669            params=[],
12670        )
12671
12672    def get_burn(self, netuid: int = 0) -> int:
12673        """
12674        Queries the network for the burn setting.
12675
12676        Retrieves the burn value, which represents the amount of the
12677        $COMAI token that is 'burned' or permanently removed from
12678        circulation.
12679
12680        Args:
12681            netuid: The network UID for which to query the burn value.
12682
12683        Returns:
12684            The burn value for the specified network subnet.
12685
12686        Raises:
12687            QueryError: If the query to the network fails or is invalid.
12688        """
12689
12690        return self.query("Burn", params=[netuid])
12691
12692    def get_min_burn(self) -> int:
12693        """
12694        Queries the network for the minimum burn setting.
12695
12696        Retrieves the minimum burn value, indicating the lowest
12697        amount of the $COMAI tokens that can be 'burned' or
12698        permanently removed from circulation.
12699
12700        Returns:
12701            The minimum burn value for the network.
12702
12703        Raises:
12704            QueryError: If the query to the network fails or is invalid.
12705        """
12706
12707        return self.query(
12708            "MinBurn",
12709            params=[],
12710        )
12711
12712    def get_min_weight_stake(self) -> int:
12713        """
12714        Queries the network for the minimum weight stake setting.
12715
12716        Retrieves the minimum weight stake, which represents the lowest
12717        stake weight that is allowed for certain operations or
12718        transactions within the network.
12719
12720        Returns:
12721            The minimum weight stake for the network.
12722
12723        Raises:
12724            QueryError: If the query to the network fails or is invalid.
12725        """
12726
12727        return self.query("MinWeightStake", params=[])
12728
12729    def get_vote_mode_global(self) -> str:
12730        """
12731        Queries the network for the global vote mode setting.
12732
12733        Retrieves the global vote mode, which defines the overall voting
12734        methodology or approach used across the network by default.
12735
12736        Returns:
12737            The global vote mode setting for the network.
12738
12739        Raises:
12740            QueryError: If the query to the network fails or is invalid.
12741        """
12742
12743        return self.query(
12744            "VoteModeGlobal",
12745        )
12746
12747    def get_max_proposals(self) -> int:
12748        """
12749        Queries the network for the maximum number of proposals allowed.
12750
12751        Retrieves the upper limit on the number of proposals that can be
12752        active or considered at any given time within the network.
12753
12754        Returns:
12755            The maximum number of proposals allowed on the network.
12756
12757        Raises:
12758            QueryError: If the query to the network fails or is invalid.
12759        """
12760
12761        return self.query(
12762            "MaxProposals",
12763        )
12764
12765    def get_max_registrations_per_block(self) -> int:
12766        """
12767        Queries the network for the maximum number of registrations per block.
12768
12769        Retrieves the maximum number of registrations that can
12770        be processed in each block within the network.
12771
12772        Returns:
12773            The maximum number of registrations per block on the network.
12774
12775        Raises:
12776            QueryError: If the query to the network fails or is invalid.
12777        """
12778
12779        return self.query(
12780            "MaxRegistrationsPerBlock",
12781            params=[],
12782        )
12783
12784    def get_max_name_length(self) -> int:
12785        """
12786        Queries the network for the maximum length allowed for names.
12787
12788        Retrieves the maximum character length permitted for names
12789        within the network. Such as the module names
12790
12791        Returns:
12792            The maximum length allowed for names on the network.
12793
12794        Raises:
12795            QueryError: If the query to the network fails or is invalid.
12796        """
12797
12798        return self.query(
12799            "MaxNameLength",
12800            params=[],
12801        )
12802
12803    def get_global_vote_threshold(self) -> int:
12804        """
12805        Queries the network for the global vote threshold.
12806
12807        Retrieves the global vote threshold, the minimum share of votes
12808        required for a decision to pass in the network's governance process.
12809
12810        Returns:
12811            The global vote threshold for the network.
12812
12813        Raises:
12814            QueryError: If the query to the network fails or is invalid.
12815        """
12816
12817        return self.query(
12818            "GlobalVoteThreshold",
12819        )
12820
12821    def get_max_allowed_subnets(self) -> int:
12822        """
12823        Queries the network for the maximum number of allowed subnets.
12824
12825        Retrieves the upper limit on the number of subnets that can
12826        be created or operated within the network.
12827
12828        Returns:
12829            The maximum number of allowed subnets on the network.
12830
12831        Raises:
12832            QueryError: If the query to the network fails or is invalid.
12833        """
12834
12835        return self.query(
12836            "MaxAllowedSubnets",
12837            params=[],
12838        )
12839
12840    def get_max_allowed_modules(self) -> int:
12841        """
12842        Queries the network for the maximum number of allowed modules.
12843
12844        Retrieves the upper limit on the number of modules that
12845        can be registered within the network.
12846
12847        Returns:
12848            The maximum number of allowed modules on the network.
12849
12850        Raises:
12851            QueryError: If the query to the network fails or is invalid.
12852        """
12853
12854        return self.query(
12855            "MaxAllowedModules",
12856            params=[],
12857        )
12858
12859    def get_min_stake(self, netuid: int = 0) -> int:
12860        """
12861        Queries the network for the minimum stake required to register a key.
12862
12863        Retrieves the minimum amount of stake necessary for
12864        registering a key within a specific network subnet.
12865
12866        Args:
12867            netuid: The network UID for which to query the minimum stake.
12868
12869        Returns:
12870            The minimum stake required for key registration in nanos.
12871
12872        Raises:
12873            QueryError: If the query to the network fails or is invalid.
12874        """
12875
12876        return self.query("MinStake", params=[netuid])
12877
12878    def get_stake(
12879        self,
12880        key: Ss58Address,
12881        netuid: int = 0,
12882    ) -> int:
12883        """
12884        Queries the network for the stake delegated with a specific key.
12885
12886        Retrieves the total amount of staked tokens
12887        delegated to a specific key address.
12888
12889        Args:
12890            key: The address of the key to query the stake for.
12891            netuid: The network UID from which to get the query.
12892
12893        Returns:
12894            The amount of stake held by the specified key in nanos.
12895
12896        Raises:
12897            QueryError: If the query to the network fails or is invalid.
12898        """
12899
12900        return self.query(
12901            "Stake",
12902            params=[netuid, key],
12903        )
12904
12905    def get_stakefrom(
12906        self,
12907        key_addr: Ss58Address,
12908        netuid: int = 0,
12909    ) -> dict[str, int]:
12910        """
12911        Retrieves a mapping of the keys that stake to a specific key address.
12912
12913        Queries the network for all the stakes received by a
12914        particular key from different sources.
12915
12916        Args:
12917            key_addr: The address of the key to query stakes from.
12918
12919            netuid: The network UID from which to get the query.
12920
12921        Returns:
12922            A dictionary mapping key addresses to the amount of stake
12923            received from each.
12924
12925        Raises:
12926            QueryError: If the query to the network fails or is invalid.
12927        """
12928        result = self.query("StakeFrom", [netuid, key_addr])
12929
12930        return dict(result)
12931
12932    def get_staketo(
12933        self,
12934        key_addr: Ss58Address,
12935        netuid: int = 0,
12936    ) -> dict[str, int]:
12937        """
12938        Retrieves a mapping of the keys that a specific key address stakes to.
12939
12940        Queries the network for all the stakes made by a particular key to
12941        different destinations.
12942
12943        Args:
12944            key_addr: The address of the key to query stakes to.
12945
12946            netuid: The network UID from which to get the query.
12947
12948        Returns:
12949            A dictionary mapping key addresses to the
12950            amount of stake given to each.
12951
12952        Raises:
12953            QueryError: If the query to the network fails or is invalid.
12954        """
12955
12956        result = self.query("StakeTo", [netuid, key_addr])
12957
12958        return dict(result)
12959
12960    def get_balance(
12961        self,
12962        addr: Ss58Address,
12963    ) -> int:
12964        """
12965        Retrieves the balance of a specific key.
12966
12967        Args:
12968            addr: The address of the key to query the balance for.
12969
12970        Returns:
12971            The free balance of the specified key, in nanos.
12972
12973        Raises:
12974            QueryError: If the query to the network fails or is invalid.
12975        """
12976
12977        result = self.query("Account", module="System", params=[addr])
12978
12979        return result["data"]["free"]
12980
12981    def get_block(self, block_hash: str | None = None) -> dict[Any, Any] | None:
12982        """
12983        Retrieves information about a specific block in the network.
12984
12985        Queries the network for details about a block, such as its number,
12986        hash, and other relevant information.
12987
12988        Returns:
12989            The requested information about the block,
12990            or None if the block does not exist
12991            or the information is not available.
12992
12993        Raises:
12994            QueryError: If the query to the network fails or is invalid.
12995        """
12996
12997        with self.get_conn() as substrate:
12998            block: dict[Any, Any] | None = substrate.get_block(  # type: ignore
12999                block_hash  # type: ignore
13000            )
13001
13002        return block
13003
13004    def get_existential_deposit(self, block_hash: str | None = None) -> int:
13005        """
13006        Retrieves the existential deposit value for the network.
13007
13008        The existential deposit is the minimum balance that must be maintained
13009        in an account to prevent it from being purged. Denominated in nano units.
13010
13011        Returns:
13012            The existential deposit value in nano units.
13013        Note:
13014            The value is read from the chain's runtime constants and
13015            reflects the network's configuration at the given block hash.
13016        """
13017
13018        with self.get_conn() as substrate:
13019            result: int = substrate.get_constant(  #  type: ignore
13020                "Balances", "ExistentialDeposit", block_hash
13021            ).value  #  type: ignore
13022
13023        return result
13024
13025
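# Editor's note: a self-contained sketch, with hypothetical data, of the
# result shapes the client methods above rely on. query_map results are
# indexed by storage name, and System.Account records nest the free balance
# under "data".
_example_map = {"SubnetNames": {0: "subnet-zero"}}
assert _example_map["SubnetNames"][0] == "subnet-zero"

_example_account = {"data": {"free": 1_000_000_000}}
assert _example_account["data"]["free"] == 1_000_000_000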
13026---
13027File: /validator-api/validator_api/communex/errors.py
13028---
13029
13030class ChainTransactionError(Exception):
13031    """Error for any chain transaction related errors."""
13032
13033
13034class NetworkError(BaseException):
13035    """Base for any network related errors."""
13036
13037
13038class NetworkQueryError(NetworkError):
13039    """Network query related error."""
13040
13041
13042class NetworkTimeoutError(NetworkError):
13043    """Timeout error"""
13044
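# Editor's note: because NetworkError derives from BaseException rather than
# Exception, a bare `except Exception` handler will NOT catch it; callers must
# catch NetworkError explicitly. A minimal, self-contained illustration using
# a hypothetical BaseException-derived error:
def _caught_by_except_exception(exc: BaseException) -> bool:
    try:
        raise exc
    except Exception:
        return True
    except BaseException:
        return False

class _BaseDerived(BaseException):
    pass

assert _caught_by_except_exception(ValueError("x")) is True
assert _caught_by_except_exception(_BaseDerived()) is False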
13045
13046---
13047File: /validator-api/validator_api/communex/key.py
13048---
13049
13050from typing import TypeGuard
13051
13052from substrateinterface import Keypair  # type: ignore
13053from substrateinterface.utils import ss58  # type: ignore
13054
13055from validator_api.communex.types import Ss58Address
13056
13057
13058def is_ss58_address(address: str, ss58_format: int = 42) -> TypeGuard[Ss58Address]:
13059    """
13060    Validates whether the given string is a valid SS58 address.
13061
13062    Args:
13063        address: The string to validate.
13064        ss58_format: The SS58 format code to validate against.
13065
13066    Returns:
13067        True if the address is valid, False otherwise.
13068    """
13069
13070    return ss58.is_valid_ss58_address(address, valid_ss58_format=ss58_format)
13071
13072
13073def check_ss58_address(address: str | Ss58Address, ss58_format: int = 42) -> Ss58Address:
13074    """
13075    Validates whether the given string is a valid SS58 address.
13076
13077    Args:
13078        address: The string to validate.
13079        ss58_format: The SS58 format code to validate against.
13080
13081    Returns:
13082        The validated SS58 address.
13083
13084    Raises:
13085        AssertionError: If the address is invalid.
13086    """
13087
13088    assert is_ss58_address(
13089        address, ss58_format), f"Invalid SS58 address '{address}'"
13090    return Ss58Address(address)
13091
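# Editor's note: the `assert` in check_ss58_address above is stripped when
# Python runs with -O, silently disabling validation. A hedged sketch of an
# always-on variant; the `is_valid` predicate parameter stands in for
# is_ss58_address and is an editorial assumption, not part of this API:
def _check_address_strict(address: str, is_valid) -> str:
    if not is_valid(address):
        raise ValueError(f"Invalid SS58 address '{address}'")
    return address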
13092
13093def generate_keypair() -> Keypair:
13094    """
13095    Generates a new keypair.
13096    """
13097    mnemonic = Keypair.generate_mnemonic()
13098    keypair = Keypair.create_from_mnemonic(mnemonic)
13099    return keypair
13100
13101
13102---
13103File: /validator-api/validator_api/communex/types.py
13104---
13105
13106"""
13107Common types for the communex module.
13108"""
13109
13110from typing import NewType, TypedDict
13111
13112Ss58Address = NewType("Ss58Address", str)
13113"""Substrate SS58 address.
13114
13115The `SS58 encoded address format`_ is based on the Bitcoin Base58Check format,
13116but with a few modifications specifically designed to suit Substrate-based
13117chains.
13118
13119.. _SS58 encoded address format:
13120    https://docs.substrate.io/reference/address-formats/
13121"""
13122
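# Editor's note: NewType is erased at runtime, so an Ss58Address is a plain
# str with no validation attached; explicit checks live in key.py. A minimal,
# self-contained illustration with a hypothetical address string:
from typing import NewType
_Addr = NewType("_Addr", str)
assert isinstance(_Addr("5Fhypothetical"), str)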
13123
13124# TODO: replace with dataclasses
13125
13126
13127class NetworkParams(TypedDict):
13128    max_allowed_modules: int
13129    max_registrations_per_block: int
13130    target_registrations_interval: int  #  in blocks
13131    target_registrations_per_interval: int
13132    unit_emission: int
13133    max_name_length: int
13134    min_name_length: int
13135    burn_rate: int
13136    min_burn: int  # min burn to register
13137    max_burn: int  # max burn to register
13138    min_stake: int
13139    min_weight_stake: int
13140    max_allowed_subnets: int
13141    adjustment_alpha: int
13142    floor_delegation_fee: int
13143    max_allowed_weights: int
13144    curator: Ss58Address
13145    proposal_cost: int
13146    proposal_expiration: int
13147    proposal_participation_threshold: int
13148    subnet_stake_threshold: int
13149
13150
13151class SubnetParams(TypedDict):
13152    founder: Ss58Address
13153    founder_share: int
13154    immunity_period: int
13155    incentive_ratio: int
13156    max_allowed_uids: int
13157    max_allowed_weights: int
13158    min_allowed_weights: int
13159    max_stake: int
13160    max_weight_age: int
13161    min_stake: int
13162    name: str
13163    tempo: int
13164    trust_ratio: int
13165    vote_mode: str
13166    bonds_ma: int | None
13167    maximum_set_weight_calls_per_epoch: int | None
13168
13169
13170# redundant "TypedDict" inheritance because pdoc warns otherwise.
13171# see https://github.com/mitmproxy/pdoc/blob/26d40827ddbe1658e8ac46cd092f17a44cf0287b/pdoc/doc.py#L691-L692
13172class SubnetParamsWithEmission(SubnetParams, TypedDict):
13173    """SubnetParams with emission field.
13174    """
13175    emission: int
13176    """Subnet emission percentage (0-100).
13177    """
13178
13179
13180class ModuleInfo(TypedDict):
13181    uid: int
13182    key: Ss58Address
13183    name: str
13184    address: str  # "<ip>:<port>"
13185    emission: int
13186    incentive: int
13187    dividends: int
13188    stake_from: list[tuple[str, int]]  # TODO: type key with Ss58Address
13189    regblock: int  # block number
13190    last_update: int  # block number
13191    stake: int
13192    delegation_fee: int
13193    metadata: str
13194
13195
13196class ModuleInfoWithBalance(ModuleInfo):
13197    balance: int
13198
13199
13200class ModuleInfoWithOptionalBalance(ModuleInfo):
13201    balance: int | None
13202
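# Editor's note: the TypedDict classes above are static-typing constructs
# only; at runtime their values are plain dicts and missing or extra fields
# are not rejected. A minimal illustration with hypothetical values:
_info = {"uid": 0, "name": "module-a"}  # incomplete ModuleInfo, accepted at runtime
assert _info["uid"] == 0 and "key" not in _info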
13203
13204---
13205File: /validator-api/validator_api/cron/confirm_purchase.py
13206---
13207
13208import asyncio
13209import time
13210from datetime import datetime
13211import validator_api.config as config
13212
13213from sqlalchemy.orm import Session
13214
13215from validator_api.database import get_db_context
13216from validator_api.database.models.focus_video_record import FocusVideoRecord, FocusVideoStateInternal
13217
13218import bittensor as bt
13219
13220from validator_api.utils.wallet import get_transaction_from_block_hash
13221
13222def extrinsic_already_confirmed(db: Session, extrinsic_id: str) -> bool:
13223    record = db.query(FocusVideoRecord).filter(FocusVideoRecord.extrinsic_id == extrinsic_id)
13224    return record.first() is not None
13225
13226async def check_payment(db: Session, recipient_address: str, sender_address: str, amount: float, block_hash: str | None = None):
13227    try:
13228        print(f"Checking payment of {amount} from {sender_address} to {recipient_address}")
13229
13230        sub = bt.subtensor(network=config.NETWORK)
13231
13232        # Get all transfers associated with the recipient address
13233        transfers = await get_transaction_from_block_hash(sub, recipient_address, block_hash)
13234
13235        # Filter transfers to find the specific payment
13236        for transfer in transfers:
13237            if (
13238                transfer["from"] == sender_address and
13239                transfer["to"] == recipient_address and
13240                round(float(transfer["amount"]), 5) == round(amount, 5)
13241            ):
13242                if extrinsic_already_confirmed(db, transfer["extrinsicId"]):
13243                    continue
13244                print(f"Payment of {amount} found from {sender_address} to {recipient_address}")
13245                return transfer["extrinsicId"]
13246
13247        print(f"Payment of {amount} not found from {sender_address} to {recipient_address}")
13248        return None
13249
13250    except Exception as e:
13251        print(f'Error in checking payment: {e}')
13252        return None
13253
13254    finally:
13255        sub.close()
13256
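The transfer-matching condition in `check_payment` compares amounts only after rounding both sides to 5 decimal places, so tiny float noise in the on-chain amount does not break the match. A minimal standalone sketch of that predicate (the dict field names mirror the `transfer` dict above; the sample values are hypothetical):

```python
# Sketch of the transfer-matching predicate in check_payment: sender and
# recipient must match exactly, amounts are compared after rounding to
# 5 decimal places to absorb float noise.
def matches(transfer: dict, sender: str, recipient: str, amount: float) -> bool:
    return (
        transfer["from"] == sender
        and transfer["to"] == recipient
        and round(float(transfer["amount"]), 5) == round(amount, 5)
    )

t = {"from": "alice", "to": "bob", "amount": "1.2300000049"}
print(matches(t, "alice", "bob", 1.23))  # True: the difference is below 1e-5
```

Note that this tolerance also means two distinct payments differing by less than 1e-5 tao are indistinguishable to the matcher, which is why the code additionally checks `extrinsic_already_confirmed`.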
13257SUBTENSOR_RETRIES = 5
13258SUBTENSOR_DELAY_SECS = 2
13259
13260async def confirm_transfer(
13261    db: Session,
13262    video_owner_coldkey: str,
13263    video_id: str,
13264    miner_hotkey: str,
13265    block_hash: str | None = None,
13266    with_lock: bool = False
13267):
13268    subtensor = bt.subtensor(network=config.NETWORK)
13269
13270    video = db.query(FocusVideoRecord).filter(
13271        FocusVideoRecord.video_id == video_id,
13272        FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASE_PENDING,
13273        FocusVideoRecord.miner_hotkey == miner_hotkey,
13274        FocusVideoRecord.deleted_at.is_(None),
13275    )
13276    if with_lock:
13277        video = video.with_for_update()
13278    video = video.first()
13279
13280    if not video:
13281        print(f"Video <{video_id}> not found")
13282        return False
13283    
13284    amount = video.expected_reward_tao
13285
13286    current_time = datetime.utcnow()
13287    print(f"[{current_time}] | Scanning block hash <{block_hash}> for payment transaction to address <{video_owner_coldkey}> ...")
13288    for attempt in range(SUBTENSOR_RETRIES):
13289        try:
13290            miner_coldkey = subtensor.get_hotkey_owner(miner_hotkey)
13291            print(f"Miner coldkey: {miner_coldkey}")
13292            
13293            extrinsic_id = await check_payment(db, video_owner_coldkey, miner_coldkey, amount, block_hash)
13294            if extrinsic_id is not None:
13295                print(f"Miner <{miner_hotkey}> successfully purchased focus recording <{video_id}>!")
13296                video.miner_hotkey = miner_hotkey
13297                video.processing_state = FocusVideoStateInternal.PURCHASED
13298                video.updated_at = datetime.utcnow()
13299                video.extrinsic_id = extrinsic_id
13300                video.earned_reward_tao = amount
13301                db.add(video)
13302                db.commit()
13303                return True
13304
13305        except Exception as e:
13306            if attempt < SUBTENSOR_RETRIES - 1:  # if it's not the last attempt
13307                if "Broken pipe" in str(e) or "EOF occurred in violation of protocol" in str(e) or "[SSL: BAD_LENGTH]" in str(e):
13308                    print(f"Connection to subtensor was lost. Re-initializing subtensor and retrying in {SUBTENSOR_DELAY_SECS} seconds...")
13309                    subtensor = bt.subtensor(network=config.NETWORK)
13310                    await asyncio.sleep(SUBTENSOR_DELAY_SECS)
13311                else:
13312                    print(f"Attempt #{attempt + 1} to sub.get_hotkey_owner() and check_payment() failed. Retrying in {SUBTENSOR_DELAY_SECS} seconds...")
13313                    print(f"Error: {str(e)}")
13314                    await asyncio.sleep(SUBTENSOR_DELAY_SECS)
13315            else:
13316                print(f"All {SUBTENSOR_RETRIES} attempts failed. Unable to retrieve miner coldkey and confirm payment.")
13317                print(f"Final error: {str(e)}")
13318                return False
13319    # we got here because we could not confirm the payment. Let's return false to let the miner know
13320    return False
13321
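The loop in `confirm_transfer` is a retry-with-delay pattern around a flaky RPC call, with a special branch that re-initializes the connection on transport errors. A standalone sketch of the core pattern, with a stub standing in for `subtensor.get_hotkey_owner` (the stub names and failure count are hypothetical):

```python
import asyncio

RETRIES = 3
DELAY_SECS = 0  # zero for the demo; the cron job above waits 2 seconds

attempts = 0

def flaky_get_hotkey_owner():
    """Stub standing in for subtensor.get_hotkey_owner: fails twice, then succeeds."""
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("Broken pipe")
    return "miner-coldkey"

async def with_retries(func, retries=RETRIES, delay=DELAY_SECS):
    for attempt in range(retries):
        try:
            return func()
        except Exception:
            if attempt == retries - 1:
                # mirror confirm_transfer: give up after the final attempt
                return None
            # transient transport error: wait, then retry
            await asyncio.sleep(delay)

result = asyncio.run(with_retries(flaky_get_hotkey_owner))
```

The real code distinguishes connection-level errors ("Broken pipe", SSL failures) from other exceptions only to decide whether to rebuild the subtensor object; the retry/delay skeleton is the same in both branches.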
13322
13323DELAY_SECS = 30  # 30s
13324RETRIES = 6  # 30s x 6 retries = 180s = 3 mins
13325
13326async def confirm_video_purchased(
13327    video_id: str,
13328    with_lock: bool = False
13329):
13330    """
13331    The purpose of this function is to set the video back to the SUBMITTED state 
13332    if the miner has not confirmed the purchase in time.
13333    """
13334
13335    current_time = datetime.utcnow()
13336    print(f"BACKGROUND TASK | {current_time} | Checking if video_id <{video_id}> has been marked as purchased or reverted back to SUBMITTED ...")
13337    try:
13338        for _ in range(RETRIES):
13339            await asyncio.sleep(DELAY_SECS)
13340            try:
13341                with get_db_context() as db:
13342                    video = db.query(FocusVideoRecord).filter(
13343                        FocusVideoRecord.video_id == video_id,
13344                        FocusVideoRecord.deleted_at.is_(None),
13345                    )
13346                    if with_lock:
13347                        video = video.with_for_update()
13348                    video = video.first()
13349
13350                    if not video:
13351                        print(f"Video <{video_id}> not found")
13352                        return False
13353                    
13354                    if video is not None and video.processing_state == FocusVideoStateInternal.PURCHASED:
13355                        print(f"Video <{video_id}> has been marked as PURCHASED. Stopping background task.")
13356                        return True
13357                    elif video is not None and video.processing_state == FocusVideoStateInternal.SUBMITTED:
13358                        print(f"Video <{video_id}> has been marked as SUBMITTED. Stopping background task.")
13359                        return True
13360
13361                    print(f"Video <{video_id}> has NOT been marked as PURCHASED. Retrying in {DELAY_SECS} seconds...")
13362                    # close the db connection until next retry
13363                    db.close()
13364
13365            except Exception as e:
13366                print(f"Error in checking confirm_video_purchased loop: {e}")
13367
13368        # we could not confirm the payment in time, so revert the video back to the
13369        # SUBMITTED state (i.e. mark it available for purchase). Re-open a session:
13370        # the one used in the loop was closed when its context manager exited.
13371        print(f"Video <{video_id}> has NOT been marked as PURCHASED. Reverting to SUBMITTED state...")
13372        with get_db_context() as db:
13373            video = db.query(FocusVideoRecord).filter(
13374                FocusVideoRecord.video_id == video_id,
13375                FocusVideoRecord.deleted_at.is_(None),
13376            ).first()
13377            if video is not None:
13378                video.processing_state = FocusVideoStateInternal.SUBMITTED
13379                video.updated_at = datetime.utcnow()
13380                db.add(video)
13381                db.commit()
13382        return False
13377
13378    except Exception as e:
13379        print(f"Error in confirm_video_purchased: {e}")
13380
13381    return False
13382
13383
13384
13385---
13386File: /validator-api/validator_api/database/crud/focusvideo.py
13387---
13388
13389from datetime import datetime, timedelta
13390from fastapi import HTTPException
13391from sqlalchemy.orm import Session, joinedload
13392from sqlalchemy import func, Float
13393from typing import List, Optional, Dict
13394import json
13395import time
13396import asyncio
13397
13398from validator_api.database import get_db_context
13399from validator_api.database.models.focus_video_record import FocusVideoRecord, FocusVideoInternal, FocusVideoStateInternal, TaskType
13400from validator_api.database.models.user import UserRecord
13401from validator_api.utils.marketplace import get_max_focus_tao, get_purchase_max_focus_tao, get_max_focus_points_available_today
13402from pydantic import BaseModel
13403from validator_api.services.scoring_service import VideoScore, FocusVideoEmbeddings
13404
13405
13406MIN_REWARD_TAO = 0.001
13407
13408
13409class CachedValue:
13410    def __init__(self, duration: int = 90):
13411        self._value = None
13412        self._timestamp = 0
13413        self._duration = duration
13414        self._mutex = asyncio.Lock()
13415
13416    def is_valid(self) -> bool:
13417        return (
13418            self._value is not None and 
13419            time.time() - self._timestamp < self._duration
13420        )
13421
13422    async def get_or_update(self, fetch_func):
13423        if self.is_valid():
13424            return self._value
13425
13426        try:
13427            async with self._mutex:
13428                # Double check after acquiring lock
13429                if not self.is_valid():
13430                    self._value = await fetch_func()
13431                    self._timestamp = time.time()
13432            return self._value
13433
13434        except Exception as e:
13435            print(e)
13436            raise HTTPException(500, detail="Internal error")
13437
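`CachedValue` above implements double-checked locking: the validity check runs once without the lock (fast path) and again after acquiring it, so concurrent awaiters share a single fetch instead of stampeding the database. A self-contained sketch of the same pattern (re-implemented here rather than imported, with a counter to show the fetch runs once):

```python
import asyncio
import time

# Self-contained re-implementation of the CachedValue pattern above:
# double-checked locking so concurrent awaiters trigger only one fetch.
class SimpleCachedValue:
    def __init__(self, duration: int = 90):
        self._value = None
        self._timestamp = 0.0
        self._duration = duration
        self._mutex = asyncio.Lock()

    def is_valid(self) -> bool:
        return self._value is not None and time.time() - self._timestamp < self._duration

    async def get_or_update(self, fetch_func):
        if self.is_valid():  # fast path: no lock needed
            return self._value
        async with self._mutex:
            # double check: another coroutine may have refreshed the value
            # while we were waiting on the lock
            if not self.is_valid():
                self._value = await fetch_func()
                self._timestamp = time.time()
        return self._value

calls = 0

async def expensive_fetch():
    global calls
    calls += 1
    await asyncio.sleep(0.01)  # simulate a slow DB / network call
    return "payload"

async def main():
    cache = SimpleCachedValue(duration=60)
    # ten concurrent callers share a single underlying fetch
    return await asyncio.gather(*(cache.get_or_update(expensive_fetch) for _ in range(10)))

results = asyncio.run(main())
```

One caveat visible in both the sketch and the original: the cache never invalidates early, so writes made through another session can be up to `duration` seconds stale.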
13438async def _fetch_available_focus(db: Session):
13439    # Show oldest videos first so they get rewarded fastest
13440    items = db.query(FocusVideoRecord).filter(
13441        FocusVideoRecord.processing_state == FocusVideoStateInternal.SUBMITTED,
13442        FocusVideoRecord.deleted_at.is_(None),
13443        FocusVideoRecord.expected_reward_tao > MIN_REWARD_TAO,
13444    ).order_by(FocusVideoRecord.updated_at.asc()).limit(10).all()
13445    return [FocusVideoInternal.model_validate(record) for record in items]
13446
13447_available_focus_cache = CachedValue()
13448
13449async def get_all_available_focus(db: Session):
13450    return await _available_focus_cache.get_or_update(
13451        lambda: _fetch_available_focus(db)
13452    )
13453
13454def get_pending_focus(
13455    db: Session,
13456    miner_hotkey: str
13457):
13458    try:
13459        items = db.query(FocusVideoRecord).filter_by(
13460            processing_state=FocusVideoStateInternal.PURCHASE_PENDING,
13461            miner_hotkey=miner_hotkey
13462        ).all()
13463        return items
13464    
13465    except Exception as e:
13466        print(e)
13467        raise HTTPException(500, detail="Internal error")
13468    
13469async def check_availability(
13470    db: Session,
13471    video_id: str,
13472    miner_hotkey: str,
13473    with_lock: bool = False
13474):
13475    try:
13476        video_record = db.query(FocusVideoRecord).filter(
13477            FocusVideoRecord.video_id == video_id,
13478            FocusVideoRecord.deleted_at.is_(None),
13479            FocusVideoRecord.processing_state == FocusVideoStateInternal.SUBMITTED,  # is available for purchase
13480            FocusVideoRecord.expected_reward_tao > MIN_REWARD_TAO,
13481        )
13482        if with_lock:
13483            video_record = video_record.with_for_update()
13484        video_record = video_record.first()
13485
13486        if video_record is None:
13487            return {
13488                'status': 'error',
13489                'message': f'video {video_id} not found or not available for purchase'
13490            }
13491
13492        if video_record.expected_reward_tao is None:
13493            raise HTTPException(500, detail="The video record is missing the expected reward tao, investigate this bug")
13494
13495        # mark the purchase as pending i.e. a miner has claimed the video for purchase and now just needs to pay
13496        video_record.processing_state = FocusVideoStateInternal.PURCHASE_PENDING
13497        video_record.miner_hotkey = miner_hotkey
13498        video_record.updated_at = datetime.utcnow()
13499
13500        # NOTE: we don't set the video_record.earned_reward_tao here, because we don't know if the
13501        # miner will successfully purchase the video or not. We set it later in cron/confirm_purchase.py
13502
13503        db.add(video_record)
13504        db.commit()
13505
13506        return {
13507            'status': 'success',
13508            'price': video_record.expected_reward_tao
13509        }
13510
13511    except Exception as e:
13512        print(e)
13513        raise HTTPException(500, detail="Internal error")
13514
13515def get_purchased_list(
13516    db: Session,
13517    miner_hotkey: str
13518):
13519    try:
13520        purchased_list = db.query(FocusVideoRecord).filter_by(
13521            processing_state=FocusVideoStateInternal.PURCHASED,
13522            miner_hotkey=miner_hotkey
13523        ).all()
13524        
13525        # result = [
13526        #     {
13527        #         "id": video.id,
13528        #         "task_id": video.task_id,
13529        #         "link": video.link,
13530        #         "score": video.score,
13531        #         "creator": video.creator,
13532        #         "miner_uid": video.miner_uid,
13533        #         "miner_hotkey": video.miner_hotkey,
13534        #         "estimated_tao": video.estimated_tao,
13535        #         "reward_tao": video.reward_tao,
13536        #         "status": video.status,
13537        #         "created_at": video.created_at,
13538        #         "task_str": video.task.focusing_task if video.task else None
13539        #     }
13540        #     for video in purchased_list
13541        # ]
13542
13543        # FV TODO: again, what is this for????
13544        # for video in purchased_list:
13545        #     task = get_task(db, video.task_id)
13546        #     video.task_str = task.focusing_task
13547            
13548        return purchased_list
13549    except Exception as e:
13550        print(e)
13551        # raise HTTPException(500, detail="Internal error")
13552        return []
13553
13554# def get_consumed_list(
13555#     db: Session,
13556#     miner_hotkey: str
13557# ):
13558#     try:
13559#         list = db.query(FocusVideoRecord).filter_by(
13560#             processing_state=FocusVideoStateInternal.CONSUMED,
13561#             miner_hotkey=miner_hotkey
13562#         ).all()
13563        
13564#         return list
13565#     except Exception as e:
13566#         print(e)
13567#         # raise HTTPException(500, detail="Internal error")
13568#         return []
13569
13570async def check_video_metadata(
13571    db: Session,
13572    video_id: str,
13573    user_email: str,
13574    miner_hotkey: str
13575):
13576    try:
13577        video_info = db.query(FocusVideoRecord).filter(
13578            FocusVideoRecord.video_id == video_id,
13579            FocusVideoRecord.user_email == user_email,
13580            FocusVideoRecord.miner_hotkey == miner_hotkey,
13581            FocusVideoRecord.deleted_at.is_(None)
13582        ).first()
13583
13584        if video_info is not None and video_info.processing_state == FocusVideoStateInternal.PURCHASED:
13585
13586            # # FV TODO: why do we need the task info?
13587            # task_info = db.query(models.Task).filter_by(id=video_info.task_id).first()
13588
13589            # if task_info is not None:
13590            #     video_info.status = FocusVideoEnum.Submitted
13591            #     db.add(video_info)
13592            #     db.commit()
13593            #     video_score = await score.score_video(task_info.focusing_task, task_info.clip_link)
13594            #     print(f"Video score: {video_score}")
13595            #     return {
13596            #         'success': True,
13597            #         'score': video_score
13598            #     }
13599            
13600            # return {
13601            #     'success': False,
13602            #     'message': 'No task found.'
13603            # }
13604
13605            # video_info.processing_state = FocusVideoStateInternal.VALIDATING
13606            db.add(video_info)
13607            db.commit()
13608
13609            # video_score = await score.score_video(task_info.focusing_task, task_info.clip_link)
13610            # print(f"Video score: {video_score}")
13611            video_score = video_info.video_score
13612
13613            return {
13614                'success': True,
13615                'score': video_score
13616            }
13617
13618        return {
13619            'success': False,
13620            'message': 'No video found.'
13621        }
13622
13623    except Exception as e:
13624        print(e)
13625        return {
13626            'success': False,
13627                'message': 'Internal Server Error'
13628        }
13629
13630# async def consume_video(db: Session, video_ids: str):
13631#     print(f"Consuming focus video: <{video_ids}>")
13632#     try:
13633#         videos = db.query(FocusVideoRecord).filter(
13634#             FocusVideoRecord.video_id.in_(video_ids)
13635#         ).all()
13636#         if len(videos) > 0:
13637#             for video in videos:
13638#                 if video.processing_state == FocusVideoStateInternal.CONSUMED:
13639#                     return {
13640#                         'success': False,
13641#                         'message': 'Already consumed.'
13642#                     }
13643#                 video.processing_state = FocusVideoStateInternal.CONSUMED
13644#                 db.add(video)
13645#             db.commit()
13646#             return {
13647#                 'success': True
13648#             }
13649#         else:
13650#             return {
13651#                 'success': False,
13652#                 'message': 'No Video Found'
13653#             }
13654#     except Exception as e:
13655#         print(e)
13656#         return {
13657#             'success': False,
13658#             'message': 'Internal Server Error'
13659#         }
13660
13661# def add_task_str(db:Session, video: any):
13662#     task = get_task(db, video.task_id)
13663#     video.task_str = task.focusing_task
13664#     return video
13665
13666def get_video_owner_coldkey(db: Session, video_id: str) -> str:
13667    video_record = db.query(FocusVideoRecord).filter(
13668        FocusVideoRecord.video_id == video_id,
13669        FocusVideoRecord.deleted_at.is_(None)
13670    )
13671    video_record = video_record.first()
13672
13673    if video_record is None:
13674        raise HTTPException(404, detail="Focus video not found")
13675
13676    user_record = db.query(UserRecord).filter(UserRecord.email == video_record.user_email).first()
13677    if user_record is None:
13678        raise HTTPException(404, detail="User not found")
13679
13680    return user_record.coldkey
13681
13682_already_purchased_cache = CachedValue()
13683
13684async def _already_purchased_max_focus_tao() -> bool:
13685    with get_db_context() as db:
13686        effective_max_focus_tao = await get_purchase_max_focus_tao()
13687        total_earned_tao = db.query(func.sum(FocusVideoRecord.earned_reward_tao)).filter(
13688            FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASED,
13689            FocusVideoRecord.updated_at >= datetime.utcnow() - timedelta(hours=24)
13690        ).scalar() or 0
13691        return total_earned_tao >= effective_max_focus_tao
13692
13693async def already_purchased_max_focus_tao() -> bool:
13694    return await _already_purchased_cache.get_or_update(
13695        lambda: _already_purchased_max_focus_tao()
13696    )
13697
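`_already_purchased_max_focus_tao` enforces a rolling 24-hour cap: sum the tao earned on purchased videos updated in the last day and compare it against the effective maximum. A pure-Python sketch of that check, with in-memory records standing in for the SQL aggregate (record values are hypothetical):

```python
from datetime import datetime, timedelta

# Standalone sketch of the 24-hour purchase-cap check performed by
# _already_purchased_max_focus_tao: sum earned tao over the last day
# and compare against the cap.
now = datetime.utcnow()
records = [
    {"earned_reward_tao": 0.5, "updated_at": now - timedelta(hours=1)},
    {"earned_reward_tao": 0.4, "updated_at": now - timedelta(hours=30)},  # outside the window
]

def already_purchased_max(records, cap_tao: float) -> bool:
    cutoff = datetime.utcnow() - timedelta(hours=24)
    total = sum(r["earned_reward_tao"] for r in records if r["updated_at"] >= cutoff)
    return total >= cap_tao

print(already_purchased_max(records, cap_tao=1.0))  # False: only 0.5 tao is inside the window
```

Because the real check filters on `updated_at` rather than a dedicated purchase timestamp, any later update to a purchased record would slide it back into the 24-hour window.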
13698class MinerPurchaseStats(BaseModel):
13699    purchased_videos: List[FocusVideoInternal]
13700    total_focus_points: float
13701    max_focus_points: float
13702    focus_points_percentage: float
13703
13704async def get_miner_purchase_stats(db: Session, miner_hotkey: str) -> MinerPurchaseStats:
13705    # Get videos purchased by miner in the last 24 hours
13706    purchased_videos_records = db.query(FocusVideoRecord).filter(
13707        FocusVideoRecord.miner_hotkey == miner_hotkey,
13708        FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASED,
13709        FocusVideoRecord.updated_at >= datetime.utcnow() - timedelta(hours=24)
13710    )
13711    purchased_videos_records = purchased_videos_records.all()
13712    
13713    purchased_videos = [
13714        FocusVideoInternal.model_validate(video_record)
13715        for video_record in purchased_videos_records
13716    ]
13717
13718    # Calculate total score for purchased videos (focus points = score * 100)
13719    total_focus_points = sum(video.video_score * 100 for video in purchased_videos)
13720
13721    # Calculate percentage
13722    max_focus_tao = await get_max_focus_tao()
13723    max_focus_points = get_max_focus_points_available_today(max_focus_tao)
13724    focus_points_percentage = total_focus_points / max_focus_points if max_focus_points > 0 else 0
13725
13726    return MinerPurchaseStats(
13727        purchased_videos=purchased_videos,
13728        total_focus_points=total_focus_points,
13729        max_focus_points=max_focus_points,
13730        focus_points_percentage=focus_points_percentage
13731    )
13732
13733def set_focus_video_score(db: Session, video_id: str, score_details: VideoScore, embeddings: FocusVideoEmbeddings):
13734    video_record = db.query(FocusVideoRecord).filter(
13735        FocusVideoRecord.video_id == video_id,
13736        FocusVideoRecord.deleted_at.is_(None)
13737    ).first()
13738    if video_record is None:
13739        raise HTTPException(404, detail="Focus video not found")
13740
13741    video_record.video_score = score_details.final_score
13742    video_record.video_details = {
13743        **video_record.video_details,
13744        **json.loads(score_details.model_dump_json()),
13745    }
13746    video_record.embeddings = json.loads(embeddings.model_dump_json())
13747    video_record.processing_state = FocusVideoStateInternal.READY
13748    video_record.updated_at = datetime.utcnow()
13749    video_record.task_type = TaskType.BOOSTED if score_details.boosted_multiplier > 1.0 else TaskType.USER
13750    db.add(video_record)
13751    db.commit()
13752
13753def mark_video_rejected(
13754    db: Session,
13755    video_id: str,
13756    rejection_reason: str,
13757    score_details: Optional[VideoScore]=None,
13758    embeddings: Optional[FocusVideoEmbeddings]=None,
13759    exception_string: Optional[str]=None,
13760):
13761    video_record = db.query(FocusVideoRecord).filter(
13762        FocusVideoRecord.video_id == video_id,
13763        FocusVideoRecord.deleted_at.is_(None)
13764    ).first()
13765    if video_record is None:
13766        raise HTTPException(404, detail="Focus video not found")
13767
13768    video_details = { **video_record.video_details }
13769
13770    if score_details:
13771        video_details = {
13772            **video_details,
13773            **json.loads(score_details.model_dump_json()),
13774        }
13775
13776    if exception_string:
13777        video_details["exception"] = exception_string
13778
13779    if score_details or exception_string:
13780        video_record.video_details = video_details
13781
13782    if embeddings:
13783        video_record.embeddings = json.loads(embeddings.model_dump_json())
13784
13785    video_record.processing_state = FocusVideoStateInternal.REJECTED
13786    video_record.rejection_reason = rejection_reason
13787    db.add(video_record)
13788    db.commit()
13789
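Both `set_focus_video_score` and `mark_video_rejected` update the JSONB `video_details` column by dict-merging the existing details with a JSON round-trip of the pydantic score object, so later keys override earlier ones. A minimal sketch of that merge (plain dicts and `json` stand in for the pydantic `model_dump_json` call; the sample keys are hypothetical):

```python
import json

# Sketch of the details-merge pattern used by set_focus_video_score and
# mark_video_rejected: existing JSONB details are merged with a freshly
# serialized score payload, with the payload's keys taking precedence.
existing_details = {"duration": 120.0, "exception": None}

# json round-trip stands in for score_details.model_dump_json()
score_payload = json.loads(json.dumps({"final_score": 0.87, "boosted_multiplier": 1.5}))

merged = {**existing_details, **score_payload}
print(sorted(merged))
```

The round-trip through JSON matters: it coerces pydantic-specific types (datetimes, enums) into JSON-safe values before they are stored in the JSONB column.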
13790def mark_video_submitted(db: Session, video_id: str, with_lock: bool = False):
13791    # Mark video as "SUBMITTED" if in the "PURCHASE_PENDING" state.
13792    video_record = db.query(FocusVideoRecord).filter(
13793        FocusVideoRecord.video_id == video_id,
13794        FocusVideoRecord.processing_state == FocusVideoStateInternal.PURCHASE_PENDING,
13795        FocusVideoRecord.deleted_at.is_(None)
13796    )
13797    if with_lock:
13798        video_record = video_record.with_for_update()
13799    video_record = video_record.first()
13800    
13801    if video_record is None:
13802        raise HTTPException(404, detail="Focus video not found or not in the correct state: PURCHASE_PENDING")
13803
13804    video_record.processing_state = FocusVideoStateInternal.SUBMITTED
13805    video_record.updated_at = datetime.utcnow()
13806    db.add(video_record)
13807    db.commit()
13808
13809_focus_points_cache = CachedValue(duration=60)  # Cache for 60 seconds
13810
13811async def _fetch_focus_points(db: Session) -> Dict[TaskType, float]:
13812    results = db.query(
13813        FocusVideoRecord.task_type,
13814        func.sum(
13815            func.cast(FocusVideoRecord.video_details['duration'].astext, Float) * 
13816            FocusVideoRecord.video_score
13817        ).label('focus_points')
13818    ).filter(
13819        FocusVideoRecord.processing_state.in_([
13820            FocusVideoStateInternal.SUBMITTED,
13821            FocusVideoStateInternal.PURCHASED
13822        ]),
13823        FocusVideoRecord.created_at >= datetime.utcnow() - timedelta(hours=24)
13824    ).group_by(FocusVideoRecord.task_type).all()
13825
13826    # Initialize dict with all TaskType values set to 0
13827    focus_points = {task_type: 0 for task_type in TaskType}
13828
13829    # Update with actual results
13830    for task_type, points in results:
13831        focus_points[task_type] = points or 0
13832
13833    return focus_points
13834
13835async def get_focus_points_from_last_24_hours(db: Session) -> Dict[TaskType, float]:
13836    return await _focus_points_cache.get_or_update(
13837        lambda: _fetch_focus_points(db)
13838    )
13839
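The query in `_fetch_focus_points` computes `sum(duration * score)` grouped by task type, then backfills zero for any task type with no rows. A pure-Python sketch of the same aggregation (sample video rows are hypothetical):

```python
from enum import Enum

class TaskType(Enum):
    USER = "USER"
    BOOSTED = "BOOSTED"

# Pure-Python sketch of the focus-points aggregation the SQL query performs:
# sum(duration * score) grouped by task type, with every task type present
# in the result even when it has no videos.
videos = [
    {"task_type": TaskType.USER, "duration": 120.0, "score": 0.5},
    {"task_type": TaskType.USER, "duration": 60.0, "score": 1.0},
    {"task_type": TaskType.BOOSTED, "duration": 300.0, "score": 0.2},
]

focus_points = {task_type: 0.0 for task_type in TaskType}
for v in videos:
    focus_points[v["task_type"]] += v["duration"] * v["score"]

print(focus_points[TaskType.USER])     # 120.0
print(focus_points[TaskType.BOOSTED])  # 60.0
```

In the real query the duration comes from `video_details['duration'].astext` cast to `Float`, which is why the JSONB field must hold a numeric string.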
13840
13841---
13842File: /validator-api/validator_api/database/models/__init__.py
13843---
13844
13845
13846
13847
13848---
13849File: /validator-api/validator_api/database/models/boosted_task.py
13850---
13851
13852from sqlalchemy import Column, String, Float, Integer, DateTime, Boolean
13853from validator_api.database import Base
13854from datetime import datetime
13855
13856class BoostedTask(Base):
13857    __tablename__ = 'boosted_tasks'
13858
13859    id = Column(Integer, primary_key=True)
13860    created_at = Column(DateTime, nullable=False, default=datetime.utcnow)
13861    title = Column(String(1000), nullable=False)
13862    description = Column(String(1000), nullable=False)
13863    multiplier = Column(Float, nullable=False)
13864    active = Column(Boolean, nullable=False, default=True)
13865
13866
13867
13868---
13869File: /validator-api/validator_api/database/models/focus_video_record.py
13870---
13871
13872from datetime import datetime
13873import uuid
13874from typing import Optional
13875
13876from pydantic import BaseModel, ConfigDict
13877from sqlalchemy import Column, String, DateTime, Float, Enum, Integer
13878
13879from validator_api.database import Base
13880from sqlalchemy.dialects.postgresql import JSONB
13881from validator_api.config import DB_STRING_LENGTH
13882
13883import enum
13884
13885class TaskType(enum.Enum):
13886    USER = "USER"
13887    BOOSTED = "BOOSTED"
13888
13889class FocusVideoStateExternal(enum.Enum):
13890    PROCESSING = "PROCESSING"
13891    READY = "READY"
13892    REJECTED = "REJECTED"
13893    SUBMITTED = "SUBMITTED"
13894    REWARDED = "REWARDED"
13895
13896class FocusVideoStateInternal(enum.Enum):
13897    # OMEGA Focus user facing states
13898    PROCESSING = "PROCESSING"  # User has completed task, we are currently calculating their score and checking if the video is legit
13899    READY = "READY"  # Score has been calculated and task is eligible for submission
13900    REJECTED = "REJECTED"  # Turns out that the task was NOT eligible for submission, lifecycle ended here
13901    SUBMITTED = "SUBMITTED"  # User has pressed "Submit" and the task is now listed on the marketplace, for SN24 miners to buy
13902    
13903    # Miner purchase states
13904    PURCHASE_PENDING = "PURCHASE_PENDING"  # a miner has requested to buy the video, and we have told them the amount of tao they need to send to the focus user
13905    PURCHASED = "PURCHASED"  # our background cron has confirmed that the miner has bought the focus video
13906
13907    # I think that these 2 states don't even need to exist?
13908    # VALIDATING = "VALIDATING"
13909    # CONSUMED = "CONSUMED"
13910
13911def map_focus_video_state(state: FocusVideoStateInternal) -> FocusVideoStateExternal:
13912    """
13913    The first four states are the ones that the user sees; the purchase states are the ones
13914    that the miner sees. All the user needs to know is whether the video has been purchased by a miner.
13915    """
13916    state_mapping = {
13917        FocusVideoStateInternal.PROCESSING: FocusVideoStateExternal.PROCESSING,
13918        FocusVideoStateInternal.READY: FocusVideoStateExternal.READY,
13919        FocusVideoStateInternal.REJECTED: FocusVideoStateExternal.REJECTED,
13920        FocusVideoStateInternal.SUBMITTED: FocusVideoStateExternal.SUBMITTED,
13921        FocusVideoStateInternal.PURCHASE_PENDING: FocusVideoStateExternal.SUBMITTED,
13922        FocusVideoStateInternal.PURCHASED: FocusVideoStateExternal.REWARDED,
13923        # FocusVideoStateInternal.VALIDATING: FocusVideoStateExternal.REWARDED,
13924        # FocusVideoStateInternal.CONSUMED: FocusVideoStateExternal.REWARDED,
13925    }
13926    if state in state_mapping:
13927        return state_mapping[state]
13928    else:
13929        raise ValueError(f"Invalid focus video state: {state}")
13930
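The mapping above deliberately collapses `PURCHASE_PENDING` into the external `SUBMITTED` state, so a user never sees an in-flight miner purchase. A standalone sketch of the same dict-lookup-with-fallback-error pattern (the enum members here are a hypothetical subset of the real ones):

```python
import enum

# Standalone sketch of the internal -> external state mapping above.
class External(enum.Enum):
    SUBMITTED = "SUBMITTED"
    REWARDED = "REWARDED"

class Internal(enum.Enum):
    PROCESSING = "PROCESSING"  # intentionally left unmapped below
    SUBMITTED = "SUBMITTED"
    PURCHASE_PENDING = "PURCHASE_PENDING"
    PURCHASED = "PURCHASED"

MAPPING = {
    Internal.SUBMITTED: External.SUBMITTED,
    # a pending purchase still looks "submitted" to the user
    Internal.PURCHASE_PENDING: External.SUBMITTED,
    Internal.PURCHASED: External.REWARDED,
}

def map_state(state: Internal) -> External:
    if state in MAPPING:
        return MAPPING[state]
    raise ValueError(f"Invalid focus video state: {state}")

print(map_state(Internal.PURCHASE_PENDING).value)  # SUBMITTED
```

Raising on unmapped states (rather than returning a default) means that adding a new internal state without updating the mapping fails loudly at the first lookup.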
13931class FocusVideoRecord(Base):
13932    __tablename__ = 'focus_videos'
13933
13934    video_id = Column(String(DB_STRING_LENGTH), primary_key=True, default=lambda: str(uuid.uuid4()), nullable=False)
13935    task_id = Column(String(DB_STRING_LENGTH), nullable=False)
13936    user_id = Column(String, nullable=False)
13937    user_email = Column(String, nullable=False)
13938    processing_state = Column(Enum(FocusVideoStateInternal), nullable=False, default=FocusVideoStateInternal.PROCESSING)
13939    task_type = Column(Enum(TaskType), nullable=False, default=TaskType.USER)
13940    video_score = Column(Float, nullable=True)
13941    video_details = Column(JSONB, nullable=True)
13942    embeddings = Column(JSONB, nullable=True)
13943    rejection_reason = Column(String(1000), nullable=True)
13944    expected_reward_tao = Column(Float, nullable=True)
13945    earned_reward_tao = Column(Float, nullable=True)
13946    miner_hotkey = Column(String(DB_STRING_LENGTH), nullable=True)
13947    extrinsic_id = Column(String(DB_STRING_LENGTH), nullable=True)
13948    created_at = Column(DateTime, default=datetime.utcnow)
13949    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
13950    deleted_at = Column(DateTime, nullable=True)
13951
13952    def get_duration(self) -> float:
13953        return float(self.video_details.get("duration", 0.0))
13954
13955class FocusVideoBase(BaseModel):
13956    video_id: str
13957    task_id: str
13958    user_email: str
13959    task_type: TaskType
13960    video_score: Optional[float]
13961    rejection_reason: Optional[str]
13962    expected_reward_tao: Optional[float]
13963    earned_reward_tao: Optional[float]
13964    created_at: datetime
13965    updated_at: datetime
13966    deleted_at: Optional[datetime]
13967
13968class FocusVideoInternal(FocusVideoBase):
13969    model_config = ConfigDict(from_attributes=True)
13970
13971    processing_state: FocusVideoStateInternal
13972    miner_hotkey: Optional[str]
13973
13974
13975
13976---
13977File: /validator-api/validator_api/database/models/task.py
13978---
13979
13980from sqlalchemy import Column, String, Boolean, Float, DateTime, Integer
13981from validator_api.config import DB_STRING_LENGTH
13982from validator_api.database import Base
13983from datetime import datetime
13984from pydantic import BaseModel, ConfigDict
13985from typing import Optional
13986
13987class TaskRecordPG(Base):
13988    __tablename__ = 'tasks'
13989    id = Column(String(DB_STRING_LENGTH), primary_key=True, nullable=False)
13990    info = Column(String(DB_STRING_LENGTH))
13991    description = Column(String(DB_STRING_LENGTH))
13992    checked = Column(Boolean, default=False)
13993    date = Column(DateTime, default=datetime.utcnow)
13994    theme = Column(String(DB_STRING_LENGTH), nullable=True)
13995    score = Column(Float)
13996    user_id = Column(String(DB_STRING_LENGTH))
13997    chat_id = Column(String(DB_STRING_LENGTH), nullable=True)
13998    reason = Column(String(DB_STRING_LENGTH), nullable=True)
13999    boosted_id = Column(Integer, nullable=True)
14000
14001
14002class Task(BaseModel):
14003    model_config = ConfigDict(from_attributes=True)
14004
14005    id: str
14006    info: str
14007    description: str
14008    checked: bool
14009    date: datetime
14010    theme: Optional[str]
14011    score: float
14012    user_id: str
14013    chat_id: Optional[str]
14014    reason: Optional[str]
14015    boosted_id: Optional[int]
14016
14017
14018
14019---
14020File: /validator-api/validator_api/database/models/user.py
14021---
14022
14023from datetime import datetime
14024
14025from sqlalchemy import Column, String, Float, DateTime
14026from pydantic import BaseModel
14027
14028from validator_api.config import DB_STRING_LENGTH, DB_STRING_LENGTH_LONG
14029from validator_api.database import Base
14030
14031
14032class UserRecord(Base):
14033    __tablename__ = 'users'
14034
14035    id = Column(String, primary_key=True, nullable=False)
14036    email = Column(String(DB_STRING_LENGTH), primary_key=True, nullable=False)
14037    name = Column(String(DB_STRING_LENGTH))
14038    coldkey = Column(String(DB_STRING_LENGTH))
14039    hotkey = Column(String(DB_STRING_LENGTH))
14040    tao_balance = Column(Float)
14041    tao_check_time = Column(DateTime, nullable=True)
14042    focused_task_id = Column(String(DB_STRING_LENGTH), nullable=True)
14043    created_at = Column(DateTime, default=datetime.utcnow)
14044
14045
14046class User(BaseModel):
14047    id: str
14048    email: str
14049    name: str
14050    tao_balance: float
14051    tao_check_time: datetime | None
14052    focused_task_id: str | None
14053    created_at: datetime
14054
14055
14056class UserInternal(BaseModel):
14057    coldkey: str
14058    hotkey: str
14059
14060
14061
14062---
14063File: /validator-api/validator_api/database/__init__.py
14064---
14065
14066from validator_api import config
14067from sqlalchemy import create_engine
14068from sqlalchemy.schema import MetaData
14069from sqlalchemy.ext.declarative import declarative_base
14070from sqlalchemy.orm import sessionmaker
14071from contextlib import contextmanager
14072
14073DB_HOST = config.FOCUS_DB_HOST
14074DB_NAME = config.FOCUS_DB_NAME
14075DB_USER = config.FOCUS_DB_USER
14076DB_PASSWORD = config.FOCUS_DB_PASSWORD
14077DB_PORT = config.FOCUS_DB_PORT
14078
14079DATABASE_URL = f"postgresql+psycopg2://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"
14080
14081engine = create_engine(
14082    DATABASE_URL,
14083    pool_size=20,  # bumped up from default of 5
14084    max_overflow=25,  # bumped up from default of 10
14085    pool_timeout=15,  # bumped down from default of 30
14086    pool_pre_ping=True,  # Good practice for most scenarios
14087    pool_recycle=3600,  # Recycle connections after 1 hour
14088)
14089SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
14090Base = declarative_base()
14091metadata = MetaData()
14092
14093def get_db():
14094    db = SessionLocal()
14095    try:
14096        yield db
14097    finally:
14098        db.close()
14099
14100def get_db_context():
14101    return contextmanager(get_db)()
14102
14103
14104
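The `get_db_context` helper above wraps the `get_db` generator with `contextlib.contextmanager` so the session is closed even when the caller raises. A minimal stdlib-only sketch of the same pattern, using a stand-in object instead of a real `SessionLocal` session:

```python
from contextlib import contextmanager

class FakeSession:
    """Stand-in for a SQLAlchemy session (illustrative only)."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

def get_db():
    db = FakeSession()
    try:
        yield db      # caller uses the session here
    finally:
        db.close()    # always runs, even if the caller raises

def get_db_context():
    # contextmanager() turns the generator into a `with`-compatible object
    return contextmanager(get_db)()

with get_db_context() as db:
    session = db
assert session.closed  # the finally block closed the session on exit
```

In the real API the bare generator form is what frameworks like FastAPI consume via dependency injection, while the `contextmanager`-wrapped form serves plain `with` blocks.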
14105---
14106File: /validator-api/validator_api/database/encrypted_json.py
14107---
14108
14109from cryptography.fernet import Fernet, InvalidToken
14110import json
14111from typing import Optional, Union
14112
14113from sqlalchemy.types import TypeDecorator, LargeBinary
14114from sqlalchemy.engine.interfaces import Dialect
14115from pydantic import BaseModel
14116
14117from validator_api.config import ENCRYPTION_KEY
14118
14119
14120fernet = Fernet(ENCRYPTION_KEY)
14121
14122# Type alias for any valid JSON type, including Pydantic BaseModel
14123JSONType = Union[dict, list, str, int, float, bool, None, BaseModel]
14124
14125
14126class EncryptedJSON(TypeDecorator):
14127    # For MySQL, the default limit here is 64 kb. In the prod DB, I (Salman) set it to 4GB.
14128    impl = LargeBinary
14129
14130    def process_bind_param(self, value: Optional[JSONType], dialect: Dialect) -> Optional[bytes]:
14131        if value is not None:
14132            try:
14133                return encrypt_data(value)
14134            except (TypeError, ValueError) as e:
14135                raise ValueError(f"Error encrypting data: {str(e)}")
14136        return None
14137
14138    def process_result_value(self, value: Optional[bytes], dialect: Dialect) -> Optional[JSONType]:
14139        if value is not None:
14140            try:
14141                return decrypt_data(value)
14142            except (InvalidToken, json.JSONDecodeError) as e:
14143                raise ValueError(f"Error decrypting data: {str(e)}")
14144        return None
14145
14146
14147def encrypt_data(data: JSONType) -> bytes:
14148    try:
14149        if isinstance(data, BaseModel):
14150            data = json.loads(data.model_dump_json())
14151        return fernet.encrypt(json.dumps(data).encode())
14152    except (TypeError, ValueError) as e:
14153        raise ValueError(f"Error encoding or encrypting data: {str(e)}")
14154
14155
14156def decrypt_data(encrypted_data: bytes) -> JSONType:
14157    try:
14158        decrypted_data = fernet.decrypt(encrypted_data)
14159        return json.loads(decrypted_data.decode())
14160    except InvalidToken:
14161        raise ValueError("Invalid token or key used for decryption")
14162    except json.JSONDecodeError:
14163        raise ValueError("Decrypted data is not valid JSON")
14164
14165
14166class LargeEncryptedJSON(EncryptedJSON):
14167    impl = LargeBinary(length=4 * 1024 * 1024 * 1024 - 1)  # 4 GB - 1 byte, because that's the MySQL LONGBLOB max
14168
14169class MediumEncryptedJSON(EncryptedJSON):
14170    impl = LargeBinary(length=16 * 1024 * 1024 - 1)  # 16 MB - 1 byte (MySQL MEDIUMBLOB max size)
14171
14172def test_encrypted_json():
14173    encrypted_json_type = EncryptedJSON()
14174    
14175    class FakeModel(BaseModel):
14176        name: str
14177        value: int
14178
14179    class NestedFakeModel(BaseModel):
14180        nested: FakeModel
14181
14182    # Test with different JSON types
14183    test_cases = [
14184        {"key": "value"},  # dict
14185        ["item1", "item2"],  # list
14186        "string",  # str
14187        42,  # int
14188        3.14,  # float
14189        True,  # bool
14190        None,  # null
14191        {"nested": {"list": [1, 2, 3], "dict": {"a": 1, "b": 2}}},  # complex nested structure
14192        FakeModel(name="Test", value=123),  # Pydantic BaseModel
14193        NestedFakeModel(nested=FakeModel(name="Nested", value=456)),  # Nested Pydantic BaseModel
14194    ]
14195    
14196    for case in test_cases:
14197        # Simulate database write
14198        encrypted = encrypted_json_type.process_bind_param(case, None)
14199        
14200        # Simulate database read
14201        decrypted = encrypted_json_type.process_result_value(encrypted, None)
14202        
14203        if isinstance(case, BaseModel):
14204            assert type(case)(**decrypted) == case, f"Failed for case: {case}"
14205        else:
14206            assert decrypted == case, f"Failed for case: {case}"
14207        print(f"Success: {case}")
14208
14209
14210if __name__ == "__main__":
14211    test_encrypted_json()
14212
14213
14214
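The bind/result hooks above JSON-serialize and Fernet-encrypt in one direction and reverse it in the other. The shape of that round trip can be sketched with a stdlib-only stand-in cipher (base64 here is purely illustrative and provides no security; the real code uses `cryptography.fernet.Fernet`):

```python
import base64
import json

# Illustrative stand-in for Fernet: reversible, but NOT encryption.
def encrypt_data(data) -> bytes:
    return base64.urlsafe_b64encode(json.dumps(data).encode())

def decrypt_data(blob: bytes):
    return json.loads(base64.urlsafe_b64decode(blob).decode())

# Round trip mirrors process_bind_param / process_result_value
value = {"nested": {"list": [1, 2, 3]}, "flag": True}
stored = encrypt_data(value)   # what would be written to the BLOB column
assert isinstance(stored, bytes)
assert decrypt_data(stored) == value
```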
14215---
14216File: /validator-api/validator_api/database/schemas.py
14217---
14218
14219from datetime import datetime
14220import enum
14221from typing import List, Optional
14222from pydantic import BaseModel, Field
14223
14224class TaskStatusEnum(enum.Enum):
14225    Ready = 'Ready'
14226    Running = 'Running'
14227    Stopped = 'Stopped'
14228    Completed = 'Completed'
14229
14230class FocusVideoEnum(enum.Enum):
14231    Uploaded = 'Uploaded'
14232    Available = 'Available'
14233    Pending = 'Pending'
14234    Purchased = 'Purchased'
14235    Submitted = 'Submitted'
14236    Consumed = 'Consumed'
14237
14238class TaskSchema(BaseModel):
14239    focusing_task: str = Field(...)
14240    duration: float | None = None
14241    description: str | None = None
14242    checked: bool | None = None
14243    date: str | None = None
14244    clip_link: str | None = None
14245    status: str | None = None
14246    score: float | None = None
14247    event: dict | None = None
14248
14249class UserSchema(BaseModel):
14250    email: str = Field(...)
14251    password: str = Field(...)
14252    nick_name: str = Field(...)
14253
14254class UserLoginSchema(BaseModel):
14255    email: str = Field(...)
14256    password: str = Field(...)
14257    
14258class IpfsUrlSchema(BaseModel):
14259    url: str = Field(...)
14260    miner_hotkey: str = Field(...)
14261
14262class TimeSlot(BaseModel):
14263    start: str
14264    end: str
14265
14266class FocusTask(BaseModel):
14267    id: str
14268    name: str
14269    priority: str
14270    timeSlot: TimeSlot
14271    description: str
14272    steps: List[str]
14273    resources: List[str]
14274    challenges: List[str]
14275    focusTips: List[str]
14276    isCompleted: bool
14277    totalDuration: str
14278    category: Optional[str] = None
14279
14280class Metadata(BaseModel):
14281    date: str
14282    day: str
14283    lastUpdated: datetime
14284
14285class DailySchedule(BaseModel):
14286    metadata: Metadata
14287    tasks: List[FocusTask]
14288    tools: List[str]
14289
14290
14291class Link(BaseModel):
14292    url: str = Field(..., description="URL of the website")
14293    name: str = Field(..., description="Name of the website")
14294
14295class Step(BaseModel):
14296    title: str = Field(..., description="Title of the step")
14297    content: List[str] = Field(..., description="Content of the step in paragraphs")
14298    links: Optional[List[Link]] = Field(None, description="Relevant links for the step")
14299
14300class KeyPoint(BaseModel):
14301    title: str = Field(..., description="Title of the key point")
14302    details: List[str] = Field(..., description="Details of the key point")
14303    links: Optional[List[Link]] = Field(None, description="Relevant links for the key point")
14304
14305class Analysis(BaseModel):
14306    summary: str = Field(..., description="Summary of the analysis")
14307    points: List[str] = Field(..., description="Key points or recommendations")
14308    links: Optional[List[Link]] = Field(None, description="Relevant links for the analysis")
14309
14310class TextAnalysisReport(BaseModel):
14311    title: str = Field(..., description="Title of the report")
14312    introduction: str = Field(..., description="Introduction or overview of the report")
14313    steps: List[Step] = Field(..., description="Main steps of the report")
14314    keypoints: List[KeyPoint] = Field(..., description="Key points or findings")
14315    analysis: Analysis = Field(..., description="Overall analysis or conclusion")
14316    metadata: List[str] = Field(..., description="Additional metadata about the report")
14317    timestamp: str = Field(..., description="Timestamp of the report generation (ISO 8601 date string YYYY-MM-DDTHH:MM:SS-UTC)")
14318    links: Optional[List[Link]] = Field(None, description="General links for the entire report")
14319
14335
14336---
14337File: /validator-api/validator_api/services/__init__.py
14338---
14339
14340
14341
14342
14343---
14344File: /validator-api/validator_api/services/focus_scoring_prompts.py
14345---
14346
14347
14348TASK_VALUATION_CRITERIA = """The kind of tasks that we want to see:
14349- Tasks that contribute to scientific discovery or AI advancement
14350- Creative acts that result in the creation of something new
14351- Tasks that demonstrate Chain of Thought (CoT) and are useful for training AI
14352- High novelty in approach or outcome
14353- Tasks that current AI systems struggle with
14354- Videos of coding, AI research, or solving AI engineering problems
14355- Application of the scientific process, including designing and implementing experiments
14356- Tasks that seek more knowledge or demonstrate critical thinking and creation
14357- Students learning challenging new material
14358- Office workers efficiently completing assigned work
14359
14360The kind of tasks that we don't want to see:
14361- Extremely mundane, boring, or repetitive tasks
14362- Tasks that can be easily completed by existing AI systems (e.g., basic copywriting)
14363- Tasks already present in existing datasets"""
14364
14365TASK_SCORE_SYSTEM_PROMPT = f"""
14366You are an AI tasked with evaluating proposed tasks for a cryptocurrency reward system. The goal is to encourage tasks that contribute to scientific discovery, AI advancement, creativity, education, and productivity while avoiding repetitive, busywork, or unproductive tasks.
14367
14368Here are the criteria for tasks:
14369
14370{TASK_VALUATION_CRITERIA}
14371
14372You will evaluate the task based on this rubric:
14373- Relevance: How well does the task align with what we want and avoid what we don't want?
14374- Impact: How significant is the task's potential contribution to our goals?
14375- Feasibility: Is the task realistic, achievable, and well-defined?
14376- Efficiency: Does the task make good use of resources in pursuing our objectives?
14377"""
14378
14379TASK_SCORE_USER_PROMPT = """
14380Here is the task to evaluate:
14381
14382<task_description>
14383{task_overview}
14384</task_description>
14385
14386Analyze this task based on the provided criteria and rubric. Consider both positive and negative aspects, and explain your thought process thoroughly.
14387
14388Provide your reasoning for why the task is or is not a good fit for the goal. Discuss how it aligns with or deviates from the criteria for what we want and don't want. Evaluate its potential impact, feasibility, and efficiency.
14389
14390After providing your reasoning, assign a score between 0.0 and 1.0 to indicate how well the task fits our goals. Use this scale:
14391- 0.0-0.2: Poor fit, largely irrelevant or counterproductive
14392- 0.2-0.4: Weak fit, minimal contribution to the goal
14393- 0.4-0.6: Moderate fit, somewhat helpful but not ideal
14394- 0.6-0.8: Good fit, clearly contributes to the goal
14395- 0.8-1.0: Excellent fit, highly effective in achieving the goal
14396
14397Remember to adhere to the JSON schema provided.
14398"""
14399
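The `{task_overview}` placeholder in the user prompt above is filled in with `str.format` at request time. A minimal sketch of that templating step (the prompt text here is an abbreviated stand-in for the real constant):

```python
# Abbreviated stand-in for TASK_SCORE_USER_PROMPT; the real prompt is longer.
TASK_SCORE_USER_PROMPT = """
Here is the task to evaluate:

<task_description>
{task_overview}
</task_description>
"""

prompt = TASK_SCORE_USER_PROMPT.format(task_overview="Implement a fuzz harness")
assert "Implement a fuzz harness" in prompt
assert "{task_overview}" not in prompt  # placeholder was substituted
```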
14400DETAILED_DESCRIPTION_SYSTEM_PROMPT = """
14401You are tasked with watching a screen recording of a human performing a task and creating a detailed annotation of the process. Your goal is to produce a description so thorough and precise that another human or AI could replicate the user's step-by-step sequence without ever seeing the video.
14402
14403After watching the video, you will create an annotation following the DetailedVideoDescription schema. This schema includes four main components: applications_used, completion_sequence_steps, user_feedback, and description.
14404
14405For each component of the schema, follow these guidelines:
14406
144071. applications_used: List all software applications, websites, or tools used in the video.
14408
144092. completion_sequence_steps: Provide a highly detailed, step-by-step breakdown of the entire process. Each step should be clear, concise, and actionable. Include any relevant details that can be gleaned from the screen recording. Number each step for clarity.
14410
144113. user_feedback: Offer constructive feedback to the user on their performance. Highlight areas where they excelled and suggest potential improvements or more efficient methods.
14412
144134. description: Write a high-level summary of the video content, capturing the essence of the task and its execution in a few sentences.
14414
14415When writing your annotation, be as precise and detailed as possible. Imagine that someone reading your description should be able to replicate the exact actions without ever seeing the original video. Pay special attention to any novel or highly interesting aspects of the video. Detail such aspects more thoroughly.
14416"""
14417
14418DETAILED_DESCRIPTION_USER_PROMPT = """
14419Watch the provided video carefully, paying close attention to every action taken by the user. Take note of the applications used, the sequence of steps performed, and any notable techniques employed.
14420
14421Note that the user is completing a task that is described as follows:
14422
14423<task_description>
14424{task_overview}
14425</task_description>
14426
14427Then, write a detailed description based on the criteria outlined. Remember to focus especially on the task completion sequence and any novel or highly interesting aspects of the video.
14428
14429Remember to be thorough, clear, and precise in your annotation. Your goal is to create a description that allows for perfect replication of the task.
14430
14431Remember to adhere to the JSON schema provided.
14432"""
14433
14434VIDEO_SCORING_SYSTEM_PROMPT = f"""
14435You are an expert in evaluating task completion based on video recordings.
14436Your role is to analyze a screen recording of a user performing a task and provide a detailed breakdown of their performance, focusing on how well they completed the assigned task.
14437
14438You will be provided with:
144391. A task overview describing the assigned task.
144402. The screen recording video of the user performing the task.
144413. A detailed description of the video content.
14442
14443Your goal is to evaluate the user's performance and provide a completion score following the CompletionScore schema.
14444This schema includes a final score and a rationale.
14445
14446For each component of the schema, follow these guidelines:
14447
144481. reasoning_steps: Provide a list of logical steps you took to arrive at your final score. Each step should be prefixed with "Step X: " where X is the step number. Start by reiterating the task overview and what some steps might look like to complete the task.
14449
144502. focus_score: Evaluate how focused the user was on completing the task, based on their actions. Score between 0.0 and 1.0.
14451
144523. educational_score: Assess how clear the user's steps are and how easy it is to follow along. Score between 0.0 and 1.0.
14453
144544. completion_score: Assess how well the user completed the task, considering their focus, distraction level, and how quickly they completed the task, relative to the task's difficulty. Score between 0.0 and 1.0.
14455
144565. creativity_score: Assess how creative the user's approach to the task was. Score between 0.0 and 1.0.
14457
144586. final_score: Calculate an overall completion score based on your evaluation. Score between 0.0 and 1.0.
14459
144607. rationale: Provide a concise explanation for the given completion score.
14461
14462Be thorough and objective in your evaluation, considering all aspects of the user's performance as described in the video description.
14463
14464Note that not all tasks are created equal. When evaluating the task completion, keep in mind the following criteria:
14465
14466{TASK_VALUATION_CRITERIA}
14467
14468Prioritize higher scores for tasks and completions that align with what we want to see, and lower scores for those that align with what we don't want to see.
14469"""
14470
14471VIDEO_SCORING_USER_PROMPT = """
14472Based on the task description and video provided, please provide a completion score breakdown. Evaluate how well the user completed the assigned task, considering their focus, the novelty of their approach, and overall effectiveness.
14473
14474<task_description>
14475{task_overview}
14476</task_description>
14477{detailed_video_description_string}
14478Use the following rubric to assign the focus_score:
14479- 0.0-0.2: Poor focus, distractions completely derail the task
14480- 0.2-0.4: Weak focus, distractions meaningfully affect the task but are overcome
14481- 0.4-0.6: Moderate focus, distractions are a minor inconvenience
14482- 0.6-0.8: Good focus, little to no distractions
14483- 0.8-1.0: Excellent focus, the user is completely engrossed in the task, in a flow state
14484
14485Use the following rubric to assign the educational_score:
14486- 0.0-0.2: Poor educational quality, the user's steps are unclear or difficult to follow
14487- 0.2-0.4: Weak educational quality, the user's steps can only be vaguely followed
14488- 0.4-0.6: Moderate educational quality, the user's steps can be followed with some effort
14489- 0.6-0.8: Good educational quality, the user's steps are clear and easy to follow
14490- 0.8-1.0: Excellent educational quality, the user's steps are exceptionally clear, well-structured, and instructive
14491
14492Use the following rubric to assign the creativity_score:
14493- 0.0-0.2: Poor creativity, the user's approach is unoriginal or uninteresting, not even enough to get the job done
14494- 0.2-0.4: Weak creativity, the user manages to get the job done but it's not very interesting or creative
14495- 0.4-0.6: Moderate creativity, the user's approach is original and creative
14496- 0.6-0.8: Good creativity, the user's approach is highly creative and innovative
14497- 0.8-1.0: Excellent creativity, the user's approach is groundbreaking and entirely novel
14498
14499Use the following rubric to assign the completion_score:
14500- 0.0-0.2: Poor task completion, largely irrelevant or counterproductive
14501- 0.2-0.4: Weak task completion, minimal contribution to the goal
14502- 0.4-0.6: Moderate task completion, somewhat helpful but not ideal
14503- 0.6-0.8: Good task completion, the task was diligently completed
14504- 0.8-1.0: Excellent task completion, the task was completed with high quality and efficiency
14505
14506For the final_score, use your best judgment to assign a score between 0.0 and 1.0 in light of the reasoning_steps, focus_score, educational_score, creativity_score, and completion_score.
14507
14508Remember to adhere to the JSON schema provided for the CompletionScore.
14509"""
14510
14511TASK_COMPLETION_SYSTEM_PROMPT = """
14512You are an expert in evaluating task completion based on video recordings.
14513Your role is to analyze a screen recording of a user performing a task and provide a detailed breakdown of their performance, focusing on how well they completed the assigned task.
14514Ignore the OMEGA Focus distraction notifications that may appear on the top right of the user's screen.
14515The content of these notifications should not be factored into your evaluation.
14516
14517You will be provided with:
145181. A task overview describing the assigned task.
145192. The screen recording video of the user performing the task.
145203. Detailed description of the user's actions in the video.
14521
14522Your goal is to evaluate the user's performance and provide a completion score following the CompletionScore schema.
14523This schema includes a final score and a rationale.
14524In the rationale, try to reference specific guidelines from the task overview/description to justify your score.
14525"""
14526
14527TASK_COMPLETION_USER_PROMPT = """
14528Based on the provided completion sequence steps and video provided, please provide a completion score breakdown.
14529Evaluate how well the user completed the assigned task, considering their focus and overall effectiveness.
14530Please use the task description to evaluate the user's performance, which may include specific steps needed to complete the task.
14531Ignore the OMEGA Focus distraction notifications that may appear on the top right of the user's screen.
14532EXTREMELY IMPORTANT: Again, the content of these distraction notifications should NOT be factored into your evaluation.
14533
14534This is the task overview:
14535<task_overview>
14536{task_overview}
14537</task_overview>
14538
14539This is the detailed description of the user's actions in the video, to aid you in your evaluation:
14540<completion_sequence_steps>
14541{completion_sequence_steps}
14542</completion_sequence_steps>
14543
14544If the user accomplishes the spirit of the task according to the task title, but does not complete it exactly as described according to the task description, you should still award some score (not 0.0).
14545
14546Use the following rubric to assign the completion_score:
14547- 0.0-0.2: Poor task completion, largely irrelevant or counterproductive
14548- 0.2-0.4: Weak task completion, minimal completion towards the goal
14549- 0.4-0.6: Moderate task completion, somewhat helpful but not ideal, maybe the user was distracted or did not follow the task description
14550- 0.6-0.8: Good task completion, the task was diligently completed
14551- 0.8-1.0: Excellent task completion, the task was completed with high quality and efficiency
14552"""
14553
14554BOOST_SCORING_SYSTEM_PROMPT = """
14555You are part of a system to evaluate and reward users for completing tasks.
14556You will be provided with a list of boosted tasks and their descriptions. Boosted tasks are tasks that receive an extra special multiplier to increase their score.
14557You will also be provided with a user's task description and a detailed video description.
14558Your current goal is to determine if the user-provided task matches any of the boosted tasks.
14559Return only the index of the boosted task that the user's task description most closely matches.
14560The user's task may or may not match any of the boosted tasks. If no match is found, return -1.
14561
14562Here are the boosted tasks:
14563{boosted_tasks}
14564"""
14565
14566BOOST_SCORING_USER_PROMPT = """
14567Here is the user's task title:
14568{focusing_task}
14569
14570Here is the detailed task description/breakdown:
14571{focusing_description}
14572"""
14573
14574
14575
14576---
14577File: /validator-api/validator_api/services/scoring_service.py
14578---
14579
14580import asyncio
14581from typing import List, Optional
14582import json
14583import random
14584import time
14585
14586from openai import AsyncOpenAI
14587from pydantic import BaseModel, Field, ValidationError
14588from sqlalchemy.orm import Session
14589import vertexai
14590from vertexai.generative_models import Part
14591from vertexai.preview import caching
14592from vertexai.preview.generative_models import (
14593    GenerativeModel, HarmCategory, HarmBlockThreshold, GenerationConfig,
14594)
14595from vertexai.vision_models import MultiModalEmbeddingModel, Video
14596from vertexai.vision_models import VideoSegmentConfig
14597from pinecone import Pinecone
14598
14599from validator_api.config import GOOGLE_PROJECT_ID, GOOGLE_LOCATION, OPENAI_API_KEY, GOOGLE_CLOUD_BUCKET_NAME, PINECONE_API_KEY
14600from validator_api.services import focus_scoring_prompts
14601from validator_api.utils import run_async, run_with_retries
14602from validator_api.database import get_db_context
14603from validator_api.database.models.focus_video_record import FocusVideoRecord, FocusVideoInternal
14604from validator_api.database.models.boosted_task import BoostedTask
14605from validator_api.database.models.task import TaskRecordPG
14606
14607from typing import Tuple, Optional
14608
14609TWO_MINUTES = 120  # in seconds
14610NINETY_MINUTES = 5400  # in seconds
14611FOCUS_VIDEO_MIN_SCORE = 0.05
14612FOCUS_VIDEO_MAX_SCORE = 1.0
14613MIN_VIDEO_UNIQUENESS_SCORE = 0.02
14614
14615def get_video_metadata(db: Session, video_id: str) -> Optional[FocusVideoRecord]:
14616    return db.query(FocusVideoRecord).filter(
14617        FocusVideoRecord.video_id == video_id,
14618        FocusVideoRecord.deleted_at.is_(None)
14619    ).first()
14620
14621async def query_pinecone(pinecone_index: Pinecone, vector: List[float]) -> float:
14622    async def _internal_async():
14623        response = await run_async(
14624            pinecone_index.query,
14625            vector=vector,
14626            top_k=1,
14627        )
14628        if len(response["matches"]) > 0:
14629            matches = response["matches"]
14630            similarity_score = matches[0]["score"]
14631            # for match in matches:
14632            #     print(f"Match:")
14633            #     print(f"  - Score: {match['score']}")
14634            #     print(f"  - ID: {match.get('id', 'N/A')}")
14635            #     print(f"  - Metadata: {match.get('metadata', {})}")
14636        else:
14637            print("No pinecone matches, treating similarity as 0")
14638            similarity_score = 0
14639        similarity_score = max(0.0, min(similarity_score, 1.0))
14640        return 1.0 - similarity_score
14641    return await run_with_retries(_internal_async)
14642
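`query_pinecone` above turns the nearest-neighbor similarity into a uniqueness score by clamping the top match to [0.0, 1.0] and taking its complement, defaulting to fully unique when there are no matches. That scoring step in isolation, as a hypothetical stdlib-only helper (the name `uniqueness_from_matches` is ours, not the codebase's):

```python
from typing import List

def uniqueness_from_matches(scores: List[float]) -> float:
    """Complement of the top similarity match, clamped to [0.0, 1.0].

    Hypothetical helper mirroring the logic inside query_pinecone.
    """
    if not scores:
        return 1.0  # no matches: treat the video as fully unique
    similarity = max(0.0, min(scores[0], 1.0))
    return 1.0 - similarity

assert uniqueness_from_matches([]) == 1.0
assert abs(uniqueness_from_matches([0.9]) - 0.1) < 1e-9
assert uniqueness_from_matches([1.5]) == 0.0  # out-of-range score is clamped
```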
14643class VideoUniquenessError(Exception):
14644    pass
14645
14646class TaskScoreBreakdown(BaseModel):
14647    reasoning_steps: List[str] = Field(description="Steps of reasoning used to arrive at the final score. Before each step, write the text 'Step X: '")
14648    final_score: float = Field(ge=0, le=1, description="Final score for the task, between 0.0 and 1.0")
14649    rationale: str = Field(description="Compendious user-facing explanation for the given score")
14650
14651class DetailedVideoDescription(BaseModel):
14652    applications_used: List[str] = Field(description="List of applications used in the video for completing the task")
14653    completion_sequence_steps: List[str] = Field(description="Highly detailed step-by-step breakdown of the sequence of steps taken to complete the task")
14654    user_feedback: str = Field(description="Feedback for the user to improve their task completion skills in the future")
14655    description: str = Field(description="High-level summary description of the video content")
14656
14657class CompletionScore(BaseModel):
14658    rationale: str = Field(description="Concise description of how well the user completed the task")
14659    completion_score: float = Field(ge=0, le=1, description="Final completion score, between 0.0 and 1.0")
14660
14661class VideoScore(BaseModel):
14662    # task and video scores
14663    # task_score: float
    task_uniqueness_score: Optional[float]
    video_completion_score: float
    description_uniqueness_score: Optional[float]
    video_uniqueness_score: float
    boosted_multiplier: Optional[float]
    final_score: float

    # metadata
    task_overview: str
    # task_score_breakdown: TaskScoreBreakdown
    completion_score_breakdown: CompletionScore
    detailed_video_description: DetailedVideoDescription

class FocusVideoEmbeddings(BaseModel):
    # embeddings
    task_overview_embedding: Optional[List[float]]
    detailed_video_description_embedding: Optional[List[float]]
    video_embedding: List[float]

class BoostedTaskIndex(BaseModel):
    index: int

class BoostedTaskData(BaseModel):
    title: str
    description: str
    multiplier: float

def get_s3_path(video_id: str) -> str:
    return f"clips/{video_id}.webm"

def get_gcs_uri(video_id: str) -> str:
    return f"gs://{GOOGLE_CLOUD_BUCKET_NAME}/{get_s3_path(video_id)}"

class FocusScoringService:
    def __init__(self):
        vertexai.init(project=GOOGLE_PROJECT_ID, location=GOOGLE_LOCATION)
        self.model_name = "gemini-1.5-pro-001"
        print(f"Using model: {self.model_name}")
        self.safety_settings = {
            HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_ONLY_HIGH,
            HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
            HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
            HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
        }
        self.temperature = 1.3
        self.openai_client = AsyncOpenAI(api_key=OPENAI_API_KEY)
        self.task_overview_index = Pinecone(api_key=PINECONE_API_KEY).Index("focus-task-overview-index")
        self.video_description_index = Pinecone(api_key=PINECONE_API_KEY).Index("focus-video-description-index")
        self.completion_video_index = Pinecone(api_key=PINECONE_API_KEY).Index("focus-completion-video-index")
        # [gemini task score, task uniqueness score, completion score, description uniqueness score, video uniqueness score]
        self.coefficients = [0.23, 0.16, 0.29, 0.14, 0.18]

    # Gemini API call related functions

    async def make_gemini_request_with_retries(self, system_prompt: str, user_prompt: str, video_id: Optional[str], OutputClassSchema: type[BaseModel]) -> BaseModel:
        num_retries = 3
        for retry_idx in range(num_retries):
            try:
                start = time.time()
                output = await self.make_gemini_request(system_prompt, user_prompt, video_id, OutputClassSchema)
                print(f"Got gemini output in {time.time() - start} seconds for {OutputClassSchema.__name__}")
                return output
            except json.JSONDecodeError as e:
                print(f"Error parsing JSON from Gemini response for {OutputClassSchema.__name__}, trying again: {e} ({retry_idx + 1}/{num_retries})")
                await asyncio.sleep(1)
            except ValidationError as e:
                print(f"Error turning parsed JSON into Pydantic object for {OutputClassSchema.__name__}, trying again: {e} ({retry_idx + 1}/{num_retries})")
                await asyncio.sleep(1)
            except Exception as e:
                print(f"Error making Gemini request for {OutputClassSchema.__name__}, trying again: {e} ({retry_idx + 1}/{num_retries})")
                await asyncio.sleep(6)
        raise Exception(f"Failed to turn Gemini response into JSON and then into Pydantic object for {OutputClassSchema.__name__} after {num_retries} attempts")

    async def make_gemini_request(self, system_prompt: str, user_prompt: str, video_id: Optional[str], OutputClassSchema: type[BaseModel]) -> BaseModel:
        model = GenerativeModel(
            self.model_name,
            system_instruction=system_prompt.strip(),
            safety_settings=self.safety_settings,
            generation_config=GenerationConfig(
                temperature=self.temperature,
                response_mime_type="application/json",
                response_schema=OutputClassSchema.model_json_schema(),
            ),
        )

        parts = []
        if video_id:
            parts.append(Part.from_uri(get_gcs_uri(video_id), mime_type="video/webm"))
        parts.append(user_prompt.strip())

        response = await model.generate_content_async(parts)
        return OutputClassSchema(**json.loads(response.text))

    async def get_task_score_from_gemini(self, task_overview: str) -> TaskScoreBreakdown:
        return await self.make_gemini_request_with_retries(
            system_prompt=focus_scoring_prompts.TASK_SCORE_SYSTEM_PROMPT,
            user_prompt=focus_scoring_prompts.TASK_SCORE_USER_PROMPT.format(task_overview=task_overview),
            video_id=None,
            OutputClassSchema=TaskScoreBreakdown,
        )

    async def get_detailed_video_description(self, video_id: str, task_overview: str) -> DetailedVideoDescription:
        return await self.make_gemini_request_with_retries(
            system_prompt=focus_scoring_prompts.DETAILED_DESCRIPTION_SYSTEM_PROMPT,
            user_prompt=focus_scoring_prompts.DETAILED_DESCRIPTION_USER_PROMPT.format(task_overview=task_overview),
            video_id=video_id,
            OutputClassSchema=DetailedVideoDescription,
        )

    async def get_completion_score_breakdown(
        self,
        video_id: str,
        task_overview: str,
        detailed_video_description: Optional[DetailedVideoDescription] = None,
        system_prompt: str = focus_scoring_prompts.TASK_COMPLETION_SYSTEM_PROMPT,
        user_prompt: str = focus_scoring_prompts.TASK_COMPLETION_USER_PROMPT,
    ) -> CompletionScore:
        """
        Generates a completion score breakdown for a given video.

        Args:
            video_id (str): The ID of the video to be scored.
            task_overview (str): An overview of the task associated with the video.
            detailed_video_description (Optional[DetailedVideoDescription], optional): A detailed description of the video content. Defaults to None.
            system_prompt (str, optional): The system prompt to be used for generating the completion score. Defaults to focus_scoring_prompts.TASK_COMPLETION_SYSTEM_PROMPT.
            user_prompt (str, optional): The user prompt to be used for generating the completion score. Defaults to focus_scoring_prompts.TASK_COMPLETION_USER_PROMPT.

        Returns:
            CompletionScore: The completion score breakdown for the video.

        The user_prompt should include {task_overview} and {completion_sequence_steps}.
        """
        completion_sequence_steps_string = f"""\n\n
Additionally, here is a detailed description of the video content that you should reference along with the video:

<completion_sequence_steps>
{detailed_video_description.completion_sequence_steps}
</completion_sequence_steps>
""" if detailed_video_description else ""

        return await self.make_gemini_request_with_retries(
            system_prompt=system_prompt,
            user_prompt=user_prompt.format(
                task_overview=task_overview,
                completion_sequence_steps=completion_sequence_steps_string,
            ),
            video_id=video_id,
            OutputClassSchema=CompletionScore,
        )

    # Pinecone related functions

    async def get_task_uniqueness_score(self, task_overview_embedding: List[float]) -> float:
        return await query_pinecone(self.task_overview_index, task_overview_embedding)

    async def get_description_uniqueness_score(self, detailed_video_description_embedding: List[float]) -> float:
        return await query_pinecone(self.video_description_index, detailed_video_description_embedding)

    async def get_video_uniqueness_score(self, video_embedding: List[float]) -> float:
        return await query_pinecone(self.completion_video_index, video_embedding)

    # Embedding related functions

    def get_video_duration_seconds(self, video_id: str) -> int:
        with get_db_context() as db:
            video_metadata = get_video_metadata(db, video_id)

            if video_metadata is None:
                raise ValueError(f"Focus video not found: {video_id}")

            video_duration_seconds = video_metadata.video_details.get("duration")
            if video_duration_seconds is None:
                print(f"Video duration not found for video: {video_id}")
                video_duration_seconds = 120

            return video_duration_seconds

    async def get_video_embedding(self, video_id: str, video_duration_seconds: int) -> List[float]:
        async def _internal_async():
            model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
            start_offset_sec = random.randint(0, max(0, video_duration_seconds - 120))
            end_offset_sec = min(video_duration_seconds, start_offset_sec + 120)
            embeddings = await run_async(
                model.get_embeddings,
                video=Video.load_from_file(get_gcs_uri(video_id)),
                video_segment_config=VideoSegmentConfig(
                    start_offset_sec=start_offset_sec,
                    end_offset_sec=end_offset_sec,
                    interval_sec=end_offset_sec - start_offset_sec
                )
            )
            return embeddings.video_embeddings[0].embedding
        return await run_with_retries(_internal_async)

    async def get_text_embedding(self, text: str) -> Optional[List[float]]:
        async def _internal_async():
            response = await asyncio.wait_for(self.openai_client.embeddings.create(
                input=text,
                model="text-embedding-3-large"
            ), timeout=10)
            return response.data[0].embedding

        try:
            return await run_with_retries(_internal_async)
        except Exception as e:
            print(f"Error getting text embedding: {e}")
            return None

    async def embed_and_get_task_uniqueness_score(self, task_overview: str) -> Tuple[Optional[List[float]], Optional[float]]:
        embedding = await self.get_text_embedding(task_overview)
        if embedding is None:
            return None, None
        return embedding, await self.get_task_uniqueness_score(embedding)

    async def embed_and_get_video_uniqueness_score(self, video_id: str, video_duration_seconds: int):
        embedding = await self.get_video_embedding(video_id, video_duration_seconds)
        return embedding, await self.get_video_uniqueness_score(embedding)

    async def get_detailed_video_description_embedding_score(self, video_id, task_overview):
        detailed_video_description = await self.get_detailed_video_description(video_id, task_overview)
        embedding = await self.get_text_embedding(detailed_video_description.model_dump_json())
        if embedding is None:
            return detailed_video_description, None, None
        return detailed_video_description, embedding, await self.get_description_uniqueness_score(embedding)

    async def score_video(self, video_id: str, focusing_task: str, focusing_description: str):
        """
        The video score measures how well the user completed the task, based on the task overview and the detailed video description.
        If the video is too similar to other videos, it will be rejected.
        Errors raised should cause the video to be rejected.
        """
        boosted_multiplier = 1.0
        with get_db_context() as db:
            # If the task is boosted, use the boosted task info directly.
            video_metadata = get_video_metadata(db, video_id)
            if video_metadata and video_metadata.task_id:
                task = db.query(TaskRecordPG).filter(
                    TaskRecordPG.id == video_metadata.task_id,
                ).first()
                if task:
                    boosted_task = db.query(BoostedTask).filter(
                        BoostedTask.id == task.boosted_id
                    ).first()
                    if boosted_task:
                        boosted_multiplier = boosted_task.multiplier
                        focusing_task = boosted_task.title
                        focusing_description = boosted_task.description
                        # print(f"Scoring boosted task index {boosted_task.id} multiplier {boosted_multiplier}\n\n{boosted_task.title}\n\n{boosted_task.description}")

        video_duration_seconds = self.get_video_duration_seconds(video_id)

        if video_duration_seconds < TWO_MINUTES:
            raise ValueError(f"Video duration is too short: {video_duration_seconds} seconds")

        if video_duration_seconds > NINETY_MINUTES:
            raise ValueError(f"Video duration is too long: {video_duration_seconds} seconds")

        task_overview = f"# {focusing_task}\n\n{focusing_description}"

        (
            (task_overview_embedding, task_uniqueness_score),
            # task_score_breakdown,
            (video_description, video_description_embedding, video_description_uniqueness_score),
            (video_embedding, video_uniqueness_score),
        ) = await asyncio.gather(
            self.embed_and_get_task_uniqueness_score(task_overview),  # uses openai to get embedding
            # self.get_task_score_from_gemini(task_overview),  # uses gemini to score task
            self.get_detailed_video_description_embedding_score(video_id, task_overview),  # uses gemini to get detailed description
            self.embed_and_get_video_uniqueness_score(video_id, video_duration_seconds),
        )

        if video_uniqueness_score < MIN_VIDEO_UNIQUENESS_SCORE:
            raise VideoUniquenessError("Video uniqueness score is too low.")

        completion_score_breakdown = await self.get_completion_score_breakdown(
            video_id,
            task_overview,
            detailed_video_description=video_description,
        )

        completion_gemini_score = completion_score_breakdown.completion_score
        final_score = completion_gemini_score * boosted_multiplier

        print(f"Final score: {final_score}")
        print(f"completion score breakdown: {completion_score_breakdown}")

        return VideoScore(
            task_uniqueness_score=task_uniqueness_score,
            video_completion_score=completion_gemini_score,
            description_uniqueness_score=video_description_uniqueness_score,
            video_uniqueness_score=video_uniqueness_score,
            boosted_multiplier=boosted_multiplier,
            final_score=final_score,
            task_overview=task_overview,
            completion_score_breakdown=completion_score_breakdown,
            detailed_video_description=video_description,
        ), FocusVideoEmbeddings(
            task_overview_embedding=task_overview_embedding,
            detailed_video_description_embedding=video_description_embedding,
            video_embedding=video_embedding,
        )

    # async def get_model_cached_on_video(self, video_id: str) -> GenerativeModel:
    #     video_part = Part.from_uri(get_gcs_uri(video_id), mime_type="video/webm")
    #     cached_content = caching.CachedContent.create(
    #         model_name=self.model_name,
    #         system_instruction="You are an expert video description generator. You are given a video and a task and you need to generate a detailed description of the video.",
    #         contents=[video_part],
    #         ttl=datetime.timedelta(minutes=5),
    #     )
    #     return GenerativeModel.from_cached_content(cached_content=cached_content)


def main():
    service = FocusScoringService()

    async def run():
        video_id = "29f91a6f-1393-4765-ba00-263b4cff28b6"
        task_overview = """
# Multimodal tokenization research

Read the Show-O paper to understand how they have trained a unified diffusion and autoregressive model for multimodal tokenization.
""".strip()

        score_details = await service.score_video(video_id, task_overview, "description")
        print(score_details)

        # task_overview_embedding = await service.get_text_embedding(task_overview)
        # print(len(task_overview_embedding))

        # detailed_video_description = DetailedVideoDescription(
        #     applications_used=[],
        #     completion_sequence_steps=[],
        #     user_feedback="",
        #     description=""
        # )

        # video_embedding = await service.get_video_embedding(video_id, 1740)
        # print(f"Sum: {sum(video_embedding)}, min: {min(video_embedding)}, max: {max(video_embedding)}")

        # task_score_breakdown = await service.get_task_score_from_gemini(task_overview)
        # print(task_score_breakdown)

        # completion_score_breakdown = await service.get_completion_score_breakdown(video_id, task_overview, detailed_video_description=None)
        # print(completion_score_breakdown)

        # start = time.time()
        # model = service.get_model_cached_on_video(video_id)
        # print(f"Got model in {time.time() - start} seconds")
        # for _ in range(4):
        #     start = time.time()
        #     video_description = await service.get_detailed_video_description_from_cache(model)
        #     print(f"Got detailed video description ({video_description}) in {time.time() - start} seconds")

        # for _ in range(4):
        #     start = time.time()
        #     video_description = await service.get_detailed_video_description(video_id)
        #     print(f"Got detailed video description ({video_description}) in {time.time() - start} seconds")

    asyncio.run(run())


if __name__ == "__main__":
    main()

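The scoring flow in `FocusScoringService.score_video` can be reduced to: gate on video duration, then scale the Gemini completion score by the boosted-task multiplier. A minimal sketch of that arithmetic follows; the `TWO_MINUTES` and `NINETY_MINUTES` constants are referenced but not defined in this excerpt, so the values below are assumptions matching their names.

```python
# Assumed values for constants referenced but not defined in the excerpt above.
TWO_MINUTES = 120
NINETY_MINUTES = 90 * 60

def final_score(completion_score: float, boosted_multiplier: float,
                video_duration_seconds: int) -> float:
    # Duration gates mirror score_video: reject clips that are too short or too long.
    if video_duration_seconds < TWO_MINUTES:
        raise ValueError("Video duration is too short")
    if video_duration_seconds > NINETY_MINUTES:
        raise ValueError("Video duration is too long")
    # Final score is the completion score scaled by the boosted-task multiplier.
    return completion_score * boosted_multiplier

score = final_score(0.5, 1.5, 600)  # 10-minute video, 1.5x boosted task
```

Note that uniqueness scores are computed but, in the code shown, only gate rejection (via `MIN_VIDEO_UNIQUENESS_SCORE`) rather than entering the final multiplication.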
---
File: /validator-api/validator_api/utils/__init__.py
---

import asyncio
import functools

RETRIES = 3
DELAY_SECS = 2

def run_async(func, *args, **kwargs):
    # get_running_loop() must be called from a coroutine; all call sites here are async
    loop = asyncio.get_running_loop()
    return loop.run_in_executor(None, functools.partial(func, *args, **kwargs))

async def run_with_retries(func, *args, **kwargs):
    """func can be sync or async, since we await the output if it's a coroutine"""
    for i in range(RETRIES):
        try:
            output = func(*args, **kwargs)
            if asyncio.iscoroutine(output):
                return await output
            return output
        except Exception:
            # catch Exception, not a bare except, so KeyboardInterrupt and
            # asyncio.CancelledError still propagate
            if i == RETRIES - 1:
                raise
            await asyncio.sleep(DELAY_SECS)
    raise RuntimeError("Should never happen")


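The retry semantics of `run_with_retries` can be exercised with a flaky callable: the helper below mirrors the loop above (with the delay zeroed out so the sketch runs instantly), and the callable fails twice before succeeding on the third and final attempt.

```python
import asyncio

RETRIES = 3
DELAY_SECS = 0  # zeroed for this sketch; the real helper sleeps 2s between attempts

async def run_with_retries(func, *args, **kwargs):
    # Mirrors the helper above: works for sync or async callables.
    for i in range(RETRIES):
        try:
            output = func(*args, **kwargs)
            if asyncio.iscoroutine(output):
                return await output
            return output
        except Exception:
            if i == RETRIES - 1:
                raise  # last attempt: re-raise instead of swallowing
            await asyncio.sleep(DELAY_SECS)

calls = {"n": 0}

def flaky():
    # Fails on the first two attempts, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = asyncio.run(run_with_retries(flaky))
```

A fourth failure would have propagated out of `run_with_retries`, since the last attempt re-raises.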
---
File: /validator-api/validator_api/utils/marketplace.py
---

import time
from typing import Tuple, Dict
import requests
import bittensor as bt
from validator_api.config import (
    NETWORK, BT_TESTNET, NETUID, FOCUS_REWARDS_PERCENT, FIXED_TAO_USD_ESTIMATE,
    BOOSTED_TASKS_PERCENTAGE,
)
from validator_api.utils import run_with_retries, run_async
from validator_api.database.models.focus_video_record import TaskType

TASK_TYPE_MAP = {
    TaskType.USER: 1 - BOOSTED_TASKS_PERCENTAGE,
    TaskType.BOOSTED: BOOSTED_TASKS_PERCENTAGE,
}


async def get_subtensor_and_metagraph() -> Tuple[bt.subtensor, bt.metagraph]:

    def _internal() -> Tuple[bt.subtensor, bt.metagraph]:
        subtensor = bt.subtensor(network=NETWORK)
        metagraph = bt.metagraph(NETUID)
        return subtensor, metagraph

    return await run_with_retries(_internal)


async def get_tao_price() -> float:
    return await run_with_retries(
        lambda: float(
            requests.get(
                "https://api.kucoin.com/api/v1/market/stats?symbol=TAO-USDT"
            ).json()["data"]["last"]
        )
    )

# Global cache for max focus TAO
max_focus_tao_cache = {
    'value': None,
    'timestamp': 0
}

CACHE_DURATION = 30 * 60  # 30 minutes in seconds

async def get_max_focus_tao() -> float:
    global max_focus_tao_cache
    current_time = time.time()

    # Check if cached data is still valid
    if max_focus_tao_cache['value'] is not None and current_time - max_focus_tao_cache['timestamp'] < CACHE_DURATION:
        return max_focus_tao_cache['value']

    # If cache is invalid or empty, recalculate
    subtensor, metagraph = await get_subtensor_and_metagraph()

    def _internal_sync():
        current_block = metagraph.block.item()
        metagraph.sync(current_block - 10, lite=False, subtensor=subtensor)

        total_vali_and_miner_emission = 0
        for uid in metagraph.uids.tolist():
            total_vali_and_miner_emission += metagraph.emission[uid]

        total_miner_emission = total_vali_and_miner_emission / 2  # per tempo
        total_miner_emission_per_day = total_miner_emission * 20  # 20 tempo intervals per day
        max_focus_tao = total_miner_emission_per_day * FOCUS_REWARDS_PERCENT

        if NETWORK == BT_TESTNET:
            max_focus_tao = max(2, max_focus_tao)
            # max_focus_tao = max(18, max_focus_tao)  # 92 tao per day cuz 3.12% emissions * 20% budget

        return max_focus_tao

    async def _internal_async() -> float:
        return await run_async(_internal_sync)

    max_focus_tao = await run_with_retries(_internal_async)

    # Update cache
    max_focus_tao_cache['value'] = max_focus_tao
    max_focus_tao_cache['timestamp'] = current_time

    return max_focus_tao


async def get_purchase_max_focus_tao() -> float:
    """Limit the amount of focus TAO that can be purchased to 90% of the max focus TAO, so miners can make some profit."""
    max_focus_tao = await get_max_focus_tao()
    return max_focus_tao * 0.9


def get_dollars_available_today(max_focus_tao: float) -> float:
    """Use a fixed TAO-USD estimate to keep rewards consistent for miners."""
    return max_focus_tao * FIXED_TAO_USD_ESTIMATE

def get_max_focus_points_available_today(max_focus_tao: float) -> int:
    # 1 point = 1 dollar
    return int(get_dollars_available_today(max_focus_tao))

MAX_TASK_REWARD_TAO = 0.1


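The emission budget inside `get_max_focus_tao` follows a fixed chain: half of the per-tempo emission is attributed to miners, scaled to a day at 20 tempo intervals, then multiplied by the focus rewards fraction. A sketch of just that arithmetic, with `FOCUS_REWARDS_PERCENT` set to an illustrative value (the real value comes from config):

```python
FOCUS_REWARDS_PERCENT = 0.2  # assumed value for illustration; the real one is config-driven

def max_focus_tao(total_vali_and_miner_emission: float) -> float:
    # Per the comments in _internal_sync above:
    total_miner_emission = total_vali_and_miner_emission / 2   # miners get half, per tempo
    per_day = total_miner_emission * 20                        # 20 tempo intervals per day
    return per_day * FOCUS_REWARDS_PERCENT                     # focus rewards budget share

budget = max_focus_tao(100.0)  # 100 TAO emitted per tempo
```

On testnet the function additionally floors the result at 2 TAO; that branch is omitted here.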
---
File: /validator-api/validator_api/utils/wallet.py
---

import bittensor as bt
import aiohttp
import time
from validator_api.utils import run_with_retries, run_async
from typing import List, Optional


# Global cache for TAO/USD rate
tao_usd_cache = {
    'rate': None,
    'timestamp': 0
}

CACHE_DURATION = 30 * 60  # 30 minutes in seconds

async def get_tao_usd_rate() -> Optional[float]:
    # Returns None if the request fails and no rate has ever been cached
    global tao_usd_cache
    current_time = time.time()

    # Check if cached data is still valid
    if tao_usd_cache['rate'] is not None and current_time - tao_usd_cache['timestamp'] < CACHE_DURATION:
        return tao_usd_cache['rate']

    try:
        async with aiohttp.ClientSession() as session:
            async with session.get('https://taostats.io/data.json') as response:
                if response.status == 200:
                    data = await response.json()
                    rate = float(data[0]['price'])

                    # Update cache
                    tao_usd_cache['rate'] = rate
                    tao_usd_cache['timestamp'] = current_time

                    return rate
                else:
                    print(f"Failed to fetch TAO/USD rate. Status code: {response.status}")
                    return tao_usd_cache['rate']
    except Exception as e:
        print(f"Error fetching TAO/USD rate: {str(e)}")
        return tao_usd_cache['rate']

async def check_wallet_tao_balance(wallet_key: str, subtensor_network: str) -> float:
    def _internal_sync() -> float:
        subtensor = bt.subtensor(network=subtensor_network)
        balance = subtensor.get_balance(wallet_key).tao
        return balance

    async def _internal_async() -> float:
        return await run_async(_internal_sync)

    return await run_with_retries(_internal_async)


API_URL = "https://api.subquery.network/sq/TaoStats/bittensor-indexer"
MAX_TXN = 50
GRAPHQL_QUERY = """
query ($first: Int!, $after: Cursor, $filter: TransferFilter, $order: [TransfersOrderBy!]!) {
    transfers(first: $first, after: $after, filter: $filter, orderBy: $order) {
        nodes {
            id
            from
            to
            amount
            extrinsicId
            blockNumber
        }
        pageInfo {
            endCursor
            hasNextPage
            hasPreviousPage
        }
        totalCount
    }
}
"""

async def get_transaction_from_block_hash(subtensor, wallet_address: str, block_hash: str) -> List[dict]:
    """Get all transfers associated with the provided wallet address and block_hash."""
    transactions = []
    divisor = 1e9  # planck per TAO

    block = subtensor.substrate.get_block(block_hash)
    block_num = block['header']['number']

    for extrinsic in block['extrinsics']:
        extrinsic = extrinsic.value
        if 'call' in extrinsic and extrinsic['call']['call_module'] == 'Balances':
            if extrinsic['call']['call_function'] in ['transfer', 'transfer_allow_death']:
                sender = extrinsic.get('address', 'Unknown')
                recipient = extrinsic['call']['call_args'][0]['value']
                amount = int(extrinsic['call']['call_args'][1]['value'])

                if sender == wallet_address or recipient == wallet_address:
                    transactions.append({
                        'id': extrinsic['extrinsic_hash'],
                        'from': sender,
                        'to': recipient,
                        'amount': amount / divisor,
                        # the id is not actually supposed to be the hash, but we'll let it fly
                        # for now because all we need is a unique identifier, which the hash is
                        'extrinsicId': extrinsic['extrinsic_hash'],
                        'blockNumber': block_num
                    })

    return transactions[::-1]


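The transfer-extraction loop in `get_transaction_from_block_hash` can be sketched without a live substrate connection. The extrinsic dicts below are hypothetical minimal stand-ins for the decoded substrate data; the filtering, the planck-to-TAO division by 1e9, and the final reversal mirror the function above.

```python
PLANCK_PER_TAO = 1e9  # divisor used above to convert raw amounts to TAO

def extract_transfers(extrinsics, wallet_address, block_num):
    # Keep only Balances.transfer / transfer_allow_death calls touching the wallet.
    transactions = []
    for ext in extrinsics:
        call = ext.get("call", {})
        if call.get("call_module") != "Balances":
            continue
        if call.get("call_function") not in ("transfer", "transfer_allow_death"):
            continue
        sender = ext.get("address", "Unknown")
        recipient = call["call_args"][0]["value"]
        amount = int(call["call_args"][1]["value"])
        if wallet_address in (sender, recipient):
            transactions.append({
                "from": sender,
                "to": recipient,
                "amount": amount / PLANCK_PER_TAO,  # planck -> TAO
                "blockNumber": block_num,
            })
    return transactions[::-1]  # reversed, as in the function above

# Hypothetical decoded extrinsic: alice sends 2 TAO (2e9 planck) to bob.
sample = [{
    "address": "alice",
    "call": {
        "call_module": "Balances",
        "call_function": "transfer",
        "call_args": [{"value": "bob"}, {"value": 2_000_000_000}],
    },
}]
txs = extract_transfers(sample, "bob", 123)
```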
---
File: /validator-api/validator_api/config.py
---

import os
from dotenv import load_dotenv
import json
from typing import List
import boto3
from omega import constants

load_dotenv(override=True)

def get_secret(secret_name, region_name):
    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name,
    )

    get_secret_value_response = client.get_secret_value(
        SecretId=secret_name
    )

    # For a list of exceptions thrown, see
    # https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html

    # Decrypts secret using the associated KMS key.
    secret = get_secret_value_response['SecretString']

    return secret

def parse_proxies(proxy_list: List[str]) -> List[str]:
    transformed_proxies = []
    for proxy in proxy_list:
        proxy_ip, proxy_port, proxy_user, proxy_pass = proxy.split(':')
        transformed_proxies.append(f"http://{proxy_user}:{proxy_pass}@{proxy_ip}:{proxy_port}")
    return transformed_proxies

NETWORK = os.environ["NETWORK"]
NETUID = int(os.environ["NETUID"])

ENABLE_COMMUNE = os.environ["ENABLE_COMMUNE"] == "True"
print("Running with ENABLE_COMMUNE:", ENABLE_COMMUNE)
COMMUNE_NETWORK = os.environ["COMMUNE_NETWORK"]
COMMUNE_NETUID = int(os.environ["COMMUNE_NETUID"])

API_KEY_NAME = "OMEGA_MM_API_KEY"
API_KEYS = json.loads(os.environ["API_KEYS"])

PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
PINECONE_INDEX = os.environ["PINECONE_INDEX"]
PINECONE_AUDIO_INDEX = os.environ["PINECONE_AUDIO_INDEX"]
HF_TOKEN = os.environ["HF_TOKEN"]
HF_REPO = os.environ["HF_REPO"]
HF_AUDIO_REPO = os.environ["HF_AUDIO_REPO"]
REPO_TYPE = "dataset"
TOPICS_LIST = json.loads(os.environ["TOPICS_LIST"])
PROXY_LIST = parse_proxies(json.loads(os.environ["PROXY_LIST"]))
IS_PROD = os.environ.get("IS_PROD", "false").lower() == "true"
CHECK_PROBABILITY = float(os.environ.get("CHECK_PROBABILITY", 0.1))
UPLOAD_BATCH_SIZE = int(os.environ.get("UPLOAD_BATCH_SIZE", 1024))
UPLOAD_AUDIO_BATCH_SIZE = int(os.environ.get("UPLOAD_AUDIO_BATCH_SIZE", 256))

DB_CONFIG = {
    'user': os.environ["DBUSER"],
    'password': os.environ["DBPASS"],
    'host': os.environ["DBHOST"],
    'database': os.environ["DBNAME"]
}

# Omega Focus Constants
FOCUS_DB_HOST = os.environ["FOCUS_DB_HOST"]
FOCUS_DB_NAME = os.environ["FOCUS_DB_NAME"]
FOCUS_DB_USER = os.environ["FOCUS_DB_USER"]
FOCUS_DB_PASSWORD = os.environ["FOCUS_DB_PASSWORD"]
FOCUS_DB_PORT = int(os.getenv("FOCUS_DB_PORT", 5432))
DB_STRING_LENGTH = 200
DB_STRING_LENGTH_LONG = 500
ENCRYPTION_KEY = os.environ["ENCRYPTION_KEY"]

BT_TESTNET = "test"
BT_MAINNET = "finney"
assert NETWORK in [BT_TESTNET, BT_MAINNET], "NETWORK must be either test or finney"
TAO_REFRESH_INTERVAL_MINUTES = int(os.getenv('TAO_REFRESH_INTERVAL_MINUTES', 10))

FOCUS_REWARDS_PERCENT = float(os.getenv('FOCUS_REWARDS_PERCENT', constants.FOCUS_REWARDS_PERCENT))
FOCUS_API_KEYS = json.loads(os.environ["FOCUS_API_KEYS"])
GOOGLE_AI_API_KEY = os.environ["GOOGLE_AI_API_KEY"]
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
AWS_ACCESS_KEY_ID = os.environ["AWS_ACCESS_KEY_ID"]
AWS_SECRET_ACCESS_KEY = os.environ["AWS_SECRET_ACCESS_KEY"]
AWS_S3_REGION = os.environ["AWS_S3_REGION"]
AWS_S3_BUCKET_NAME = os.environ["AWS_S3_BUCKET_NAME"]

MAX_FOCUS_POINTS_PER_HOUR = int(os.getenv("MAX_FOCUS_POINTS_PER_HOUR", 80))  # $80 / hour
FIXED_TAO_USD_ESTIMATE = float(os.getenv("FIXED_TAO_USD_ESTIMATE", 300.0))
BOOSTED_TASKS_PERCENTAGE = float(os.getenv("BOOSTED_TASKS_PERCENTAGE", 0.7))

GOOGLE_PROJECT_ID = os.getenv("GOOGLE_PROJECT_ID")
GOOGLE_LOCATION = os.getenv("GOOGLE_LOCATION", "us-central1")
GOOGLE_APPLICATION_CREDENTIALS = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
GOOGLE_CLOUD_BUCKET_NAME = os.getenv("GOOGLE_CLOUD_BUCKET_NAME")

with open(GOOGLE_APPLICATION_CREDENTIALS, "w") as f:
    f.write(get_secret("prod/gcp_service_user", region_name=AWS_S3_REGION))

SENTRY_DSN = os.getenv("SENTRY_DSN")
IMPORT_SCORE = os.getenv("IMPORT_SCORE", "true").lower() == "true"

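The `parse_proxies` helper above rewrites colon-delimited `ip:port:user:pass` entries into authenticated HTTP proxy URLs. A self-contained copy of that transformation, with a made-up credential for illustration:

```python
from typing import List

def parse_proxies(proxy_list: List[str]) -> List[str]:
    # ip:port:user:pass  ->  http://user:pass@ip:port  (same as the config helper)
    transformed_proxies = []
    for proxy in proxy_list:
        proxy_ip, proxy_port, proxy_user, proxy_pass = proxy.split(':')
        transformed_proxies.append(f"http://{proxy_user}:{proxy_pass}@{proxy_ip}:{proxy_port}")
    return transformed_proxies

urls = parse_proxies(["10.0.0.1:8080:alice:s3cret"])
```

Note the helper assumes exactly four colon-separated fields; an entry with a different shape raises `ValueError` from the unpacking.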
---
File: /validator-api/validator_api/dataset_upload.py
---

from io import BytesIO
from typing import List
from datetime import datetime
import random
import tempfile

from datasets import Dataset, Audio
from huggingface_hub import HfApi
import ulid
import soundfile as sf
import base64

from omega.protocol import VideoMetadata, AudioMetadata

from validator_api import config


HF_API = HfApi(token=config.HF_TOKEN)
NUM_BUCKETS = 1000


def get_data_path(batch_ulid_str: str) -> str:
    batch_ulid = ulid.from_str(batch_ulid_str)
    bucket = batch_ulid.int % NUM_BUCKETS
    return f"default/train/{bucket:03d}/{batch_ulid_str}.parquet"


def get_random_batch_size(batch_size: int) -> int:
    return random.choice([
        batch_size // 2,
        batch_size,
        batch_size * 2,
    ])

def create_repo(name: str) -> None:
    try:
        HF_API.create_repo(
            repo_id=name,
            repo_type=config.REPO_TYPE,
            exist_ok=True,
            token=config.HF_TOKEN
        )
        print("Successfully created/verified repository")
    except Exception as e:
        print(f"Error creating repository: {e}")

class DatasetUploader:
    def __init__(self):
        self.current_batch = []
        self.desired_batch_size = get_random_batch_size(config.UPLOAD_BATCH_SIZE)
        self.min_batch_size = 32

    def add_videos(
        self, metadata: List[VideoMetadata], video_ids: List[str],
        description_relevance_scores: List[float], query_relevance_scores: List[float],
        query: str,
    ) -> None:
        curr_time = datetime.now()
        self.current_batch.extend([
            {
                "video_id": vid_uuid,
                "youtube_id": video.video_id,
                "description": video.description,
                "views": video.views,
                "start_time": video.start_time,
                "end_time": video.end_time,
                "video_embed": video.video_emb,
                "audio_embed": video.audio_emb,
                "description_embed": video.description_emb,
                "description_relevance_score": desc_score,
15472                "query_relevance_score": query_score,
15473                "query": query,
15474                "submitted_at": int(curr_time.timestamp()),
15475            }
15476            for vid_uuid, video, desc_score, query_score
15477            in zip(video_ids, metadata, description_relevance_scores, query_relevance_scores)
15478        ])
15479        print(f"Added {len(metadata)} videos to batch, now have {len(self.current_batch)}")
15480        if len(self.current_batch) >= self.desired_batch_size:
15481            self.submit()
15482
15483    def submit(self) -> None:
15484        if len(self.current_batch) < self.min_batch_size:
15485            print(f"Need at least {self.min_batch_size} videos to submit, but have {len(self.current_batch)}")
15486            return
15487        data = self.current_batch[:self.desired_batch_size]
15488        print(f"Uploading batch of {len(data)} videos")
15489        with BytesIO() as f:
15490            dataset = Dataset.from_list(data)
15491            num_bytes = dataset.to_parquet(f); f.seek(0)  # rewind so upload_file reads from the start
15492            try:
15493                HF_API.upload_file(
15494                    path_or_fileobj=f,
15495                    path_in_repo=get_data_path(str(ulid.new())),
15496                    repo_id=config.HF_REPO,
15497                    repo_type=config.REPO_TYPE,
15498                    token=config.HF_TOKEN,
15499                )
15500                print(f"Uploaded {num_bytes} bytes to Hugging Face")
15501            except Exception as e:
15502                print(f"Error uploading to Hugging Face: {e}")
15503        self.current_batch = self.current_batch[self.desired_batch_size:]
15504        self.desired_batch_size = get_random_batch_size(config.UPLOAD_BATCH_SIZE)
15505
15506    
15507
15508class AudioDatasetUploader:
15509    def __init__(self):
15510        self.current_batch = []
15511        self.min_batch_size = 8
15512        self.desired_batch_size = get_random_batch_size(config.UPLOAD_AUDIO_BATCH_SIZE)
15513    
15514    def convert_audio_to_wav(self, audio_bytes: str) -> bytes:
15515        # Despite the name, no transcoding to WAV happens here: the
15516        # base64 payload is simply decoded and returned. Round-tripping
15517        # the bytes through a NamedTemporaryFile adds an open-handle
15518        # leak without changing the data.
15519        return base64.b64decode(audio_bytes)
15520
15521    def add_audios(
15522        self, metadata: List[AudioMetadata], audio_ids: List[str],
15523        inverse_der: float, audio_length_score: float,
15524        audio_quality_total_score: float, audio_query_score: float,
15525        query: str, total_score: float
15526    ) -> None:
15527        curr_time = datetime.now()
15528
15529        audio_files = [self.convert_audio_to_wav(audio.audio_bytes) for audio in metadata]
15530
15531
15532        
15533        self.current_batch.extend([
15534            {
15535                "audio_id": audio_uuid,
15536                "youtube_id": audio.video_id,
15537                # "audio_bytes": audio.audio_bytes,
15538                "audio": {"path": audio_file, "array": sf.read(BytesIO(base64.b64decode(audio.audio_bytes)))[0], "sampling_rate": 16000},
15539                "start_time": audio.start_time,
15540                "end_time": audio.end_time,
15541                "audio_embed": audio.audio_emb,
15542                "diar_timestamps_start": audio.diar_timestamps_start,
15543                "diar_timestamps_end": audio.diar_timestamps_end,
15544                "diar_speakers": audio.diar_speakers,
15545                "inverse_der": inverse_der,
15546                "audio_length_score": audio_length_score,
15547                "audio_quality_score": audio_quality_total_score,
15548                "query_relevance_score": audio_query_score,
15549                "total_score": total_score,
15550                "query": query,
15551                "submitted_at": int(curr_time.timestamp()),
15552            }
15553            for audio_uuid, audio_file, audio in zip(audio_ids, audio_files, metadata)
15554        ])
15555        print(f"Added {len(metadata)} audios to batch, now have {len(self.current_batch)}")
15556        if len(self.current_batch) >= self.desired_batch_size:
15557            self.submit()
15558
15559    def submit(self) -> None:
15560        if len(self.current_batch) < self.min_batch_size:
15561            print(f"Need at least {self.min_batch_size} audios to submit, but have {len(self.current_batch)}")
15562            return
15563        data = self.current_batch[:self.desired_batch_size]
15564        print(f"Uploading batch of {len(data)} audios")
15565        with BytesIO() as f:
15566            dataset = Dataset.from_list(data)
15567            dataset = dataset.cast_column("audio", Audio())
15568            num_bytes = dataset.to_parquet(f); f.seek(0)  # rewind so upload_file reads from the start
15569            try:
15570                HF_API.upload_file(
15571                    path_or_fileobj=f,
15572                    path_in_repo=get_data_path(str(ulid.new())),
15573                    repo_id=config.HF_AUDIO_REPO,
15574                    repo_type=config.REPO_TYPE,
15575                    token=config.HF_TOKEN,
15576                )
15577                print(f"Uploaded {num_bytes} bytes to Hugging Face")
15578            except Exception as e:
15579                print(f"Error uploading to Hugging Face: {e}")
15580        self.current_batch = self.current_batch[self.desired_batch_size:]
15581        self.desired_batch_size = get_random_batch_size(config.UPLOAD_AUDIO_BATCH_SIZE)
15582
15583
15584
15585
15586audio_dataset_uploader = AudioDatasetUploader()
15587video_dataset_uploader = DatasetUploader()
15588
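Reviewer note: the bucket sharding in `get_data_path` above reduces to an integer modulo over `NUM_BUCKETS` fixed directories. A dependency-free sketch of the same scheme (function name is illustrative, and a plain integer stands in for the real code's `ulid.int`):

```python
NUM_BUCKETS = 1000  # same constant as the module above

def get_data_path_from_int(batch_id: int, batch_name: str) -> str:
    # Same scheme as get_data_path: bucket = id % NUM_BUCKETS, zero-padded
    # to three digits so paths sort lexicographically.
    bucket = batch_id % NUM_BUCKETS
    return f"default/train/{bucket:03d}/{batch_name}.parquet"
```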
15589
15590
15591---
15592File: /validator-api/validator_api/imagebind_loader.py
15593---
15594
15595from typing import Optional
15596from fastapi import HTTPException
15597import asyncio
15598import threading
15599from concurrent.futures import ThreadPoolExecutor
15600from omega.imagebind_wrapper import ImageBind
15601
15602
15603class ImageBindLoader:
15604    def __init__(self):
15605        self._imagebind: Optional[ImageBind] = None
15606        self._loading_task: Optional[asyncio.Task] = None
15607        self._lock = asyncio.Lock()
15608        self._thread_pool = ThreadPoolExecutor(max_workers=1)
15609
15610    async def get_imagebind(self) -> ImageBind:
15611        """
15612        Asynchronously get or initialize ImageBind instance.
15613        Handles concurrent requests efficiently.
15614        """
15615        if self._imagebind is not None:
15616            return self._imagebind
15617
15618        if self._loading_task is None:
15619            self._loading_task = asyncio.create_task(self._load_imagebind_wrapper())
15620
15621        raise HTTPException(
15622            status_code=503,
15623            detail="ImageBind loading has started. Please try again later."
15624        )
15625
15626    def _load_imagebind_blocking(self) -> ImageBind:
15627        """Blocking method to load ImageBind in a separate thread."""
15628        return ImageBind(v2=True)
15629
15630    async def _load_imagebind_wrapper(self) -> None:
15631        """Wrapper to run the blocking load in a thread pool."""
15632        try:
15633            # Run the blocking operation in a thread pool
15634            loop = asyncio.get_running_loop()
15635            self._imagebind = await loop.run_in_executor(
15636                self._thread_pool,
15637                self._load_imagebind_blocking
15638            )
15639        finally:
15640            self._loading_task = None
15641
15642
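Reviewer note: the loader above never awaits `_loading_task` in the request path; every caller gets a 503 until a later request finds `_imagebind` populated. A toy sketch of that load-in-background-then-reject pattern, with a `RuntimeError` standing in for the `HTTPException` and all names illustrative:

```python
import asyncio

class LazyLoader:
    """Sketch: the first caller kicks off a background load; every caller
    is rejected until the resource is ready (analogous to HTTP 503)."""
    def __init__(self, load_fn):
        self._resource = None
        self._task = None
        self._load_fn = load_fn

    async def get(self):
        if self._resource is not None:
            return self._resource
        if self._task is None:
            self._task = asyncio.create_task(self._load())
        raise RuntimeError("still loading")  # stand-in for HTTPException(503)

    async def _load(self):
        try:
            # Run the blocking load off the event loop, as the real class does.
            self._resource = await asyncio.to_thread(self._load_fn)
        finally:
            self._task = None

async def demo():
    loader = LazyLoader(lambda: "model")
    try:
        await loader.get()
    except RuntimeError:
        pass  # first call: load started, resource not ready yet
    await asyncio.sleep(0.05)  # let the background load finish
    return await loader.get()

result = asyncio.run(demo())
```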
15643
15644---
15645File: /validator-api/validator_api/limiter.py
15646---
15647
15648from slowapi import Limiter
15649from slowapi.util import get_remote_address
15650
15651limiter = Limiter(key_func=get_remote_address)
15652
15653
15654
15655---
15656File: /validator-api/validator_api/score.py
15657---
15658
15659import asyncio
15660import random
15661import uuid
15662from typing import List, Tuple, Optional, BinaryIO
15663import math
15664
15665from pinecone import Pinecone
15666import torch
15667import torch.nn.functional as F
15668import soundfile as sf
15669from io import BytesIO
15670
15671from omega.protocol import Videos, VideoMetadata, AudioMetadata, Audios
15672from omega import video_utils, unstuff
15673from omega.constants import (
15674    MAX_VIDEO_LENGTH, 
15675    MIN_VIDEO_LENGTH,
15676    DIFFERENCE_THRESHOLD, 
15677    SIMILARITY_THRESHOLD, 
15678    VIDEO_DOWNLOAD_TIMEOUT, 
15679    MIN_SCORE, 
15680    FAKE_VIDEO_PUNISHMENT,
15681    QUERY_RELEVANCE_SCALING_FACTOR,
15682    DESCRIPTION_RELEVANCE_SCALING_FACTOR,
15683    VIDEO_RELEVANCE_WEIGHT,
15684    DESCRIPTION_LENGTH_WEIGHT,
15685    MIN_LENGTH_BOOST_TOKEN_COUNT,
15686    MAX_LENGTH_BOOST_TOKEN_COUNT,
15687    STUFFED_DESCRIPTION_PUNISHMENT,
15688    DIARIZATION_SCALING_FACTOR,
15689    AUDIO_LENGTH_SCALING_FACTOR,
15690    AUDIO_QUALITY_SCALING_FACTOR,
15691    AUDIO_QUERY_RELEVANCE_SCALING_FACTOR,
15692    SPEECH_CONTENT_SCALING_FACTOR,
15693    SPEAKER_DOMINANCE_SCALING_FACTOR,
15694    BACKGROUND_NOISE_SCALING_FACTOR,
15695    MAX_AUDIO_LENGTH_SECONDS,
15696    MIN_AUDIO_LENGTH_SECONDS
15697)
15698from omega.imagebind_wrapper import ImageBind, Embeddings, run_async, LENGTH_TOKENIZER
15699from omega.text_similarity import get_text_similarity_score
15700from validator_api import config
15701from validator_api.dataset_upload import video_dataset_uploader, audio_dataset_uploader
15702from omega.audio_scoring import AudioScore
15703from omega.diarization_metric import calculate_diarization_metrics
15704
15705
15706
15707
15708PINECONE_INDEX = Pinecone(api_key=config.PINECONE_API_KEY).Index(config.PINECONE_INDEX)
15709PINECONE_AUDIO_INDEX = Pinecone(api_key=config.PINECONE_API_KEY).Index(config.PINECONE_AUDIO_INDEX)
15710GPU_SEMAPHORE = asyncio.Semaphore(1)
15711DOWNLOAD_SEMAPHORE = asyncio.Semaphore(5)
15712VIDEO_TYPE = "video"
15713AUDIO_TYPE = "audio"
15714DESCRIPTION_TYPE = "description"
15715
15716
15717async def query_pinecone(vector: List[float]) -> float:
15718    response = await run_async(
15719        PINECONE_INDEX.query,
15720        vector=vector,
15721        top_k=1,
15722        filter={
15723            "modality_type": {"$eq": VIDEO_TYPE},
15724        },
15725    )
15726    if len(response["matches"]) > 0:
15727        return 1 - response["matches"][0]["score"] 
15728    else:
15729        print("No pinecone matches, returning 0")
15730        return 0
15731
15732async def get_pinecone_novelty(metadata: List[VideoMetadata]) -> List[float]:
15733    """
15734    Query the Pinecone index for each embedding's top match and return per-video novelty scores (1 - similarity).
15735    """
15736    novelty_scores = await asyncio.gather(*[
15737        query_pinecone(
15738            vector=mdata.video_emb
15739        )
15740        for mdata in metadata
15741    ])    
15742    return novelty_scores
15743
15744def compute_novelty_score_among_batch(emb: Embeddings) -> List[float]:
15745    video_tensor = emb.video
15746    num_videos = video_tensor.shape[0]
15747    novelty_scores = []
15748    for i in range(num_videos - 1):
15749        similarity_score = F.cosine_similarity(video_tensor[[i]], video_tensor[i + 1:]).max()
15750        novelty_scores.append(1 - similarity_score.item())
15751    novelty_scores.append(1.0)  # last video is 100% novel
15752    return novelty_scores
15753
15754async def async_zero() -> float:
15755    return 0
15756
15757async def compute_novelty_score(embeddings: Embeddings) -> Tuple[float, List[bool]]:
15758    local_novelty_scores = compute_novelty_score_among_batch(embeddings)
15759    global_novelty_scores = await asyncio.gather(*[
15760        async_zero() if local_score < DIFFERENCE_THRESHOLD else  # don't even query Pinecone if it's already too similar
15761        query_pinecone(vector=embedding.tolist())
15762        for embedding, local_score in zip(embeddings.video, local_novelty_scores)
15763    ])
15764    true_novelty_scores = [
15765        min(local_score, global_score) for local_score, global_score
15766        in zip(local_novelty_scores, global_novelty_scores)
15767    ]
15768    is_too_similar = [score < DIFFERENCE_THRESHOLD for score in true_novelty_scores]
15769    novelty_score = sum([
15770        score for score, is_too_similar
15771        in zip(true_novelty_scores, is_too_similar)
15772        if not is_too_similar
15773    ])
15774    return novelty_score, is_too_similar
15775
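Reviewer note: stripped of Pinecone and torch, `compute_novelty_score` above reduces to taking the minimum of each item's in-batch and index novelty, flagging anything below `DIFFERENCE_THRESHOLD` as a duplicate, and summing only the survivors. A minimal sketch (the threshold value here is an assumed placeholder, not the project's constant):

```python
DIFFERENCE_THRESHOLD = 0.05  # assumed placeholder value

def combine_novelty(local_scores, global_scores, threshold=DIFFERENCE_THRESHOLD):
    # An item is only as novel as its worst comparison (batch vs. index).
    true_scores = [min(l, g) for l, g in zip(local_scores, global_scores)]
    is_too_similar = [s < threshold for s in true_scores]
    # Only non-duplicates contribute to the aggregate novelty score.
    novelty = sum(s for s, dup in zip(true_scores, is_too_similar) if not dup)
    return novelty, is_too_similar
```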
15776
15777def upload_to_pinecone(embeddings: Embeddings, metadata: List[VideoMetadata]) -> List[str]:
15778    video_ids = [str(uuid.uuid4()) for _ in range(len(metadata))]
15779    try:
15780        PINECONE_INDEX.upsert(
15781            vectors=sum([
15782                [
15783                    {
15784                        "id": f"{modality_type[:3]}{video_uuid}",
15785                        "values": emb.tolist(),
15786                        "metadata": {
15787                            "youtube_id": video.video_id,
15788                            "modality_type": modality_type,
15789                        }
15790                    }
15791                    for emb, modality_type
15792                    in zip(
15793                        [embedding_vid, embedding_aud, embedding_des],
15794                        [VIDEO_TYPE, AUDIO_TYPE, DESCRIPTION_TYPE]
15795                    )
15796                ]
15797                for video_uuid, video, embedding_vid, embedding_aud, embedding_des
15798                in zip(video_ids, metadata, embeddings.video, embeddings.audio, embeddings.description)
15799            ], []),
15800        )
15801    except Exception as e:
15802        print(f"Failed to upload to Pinecone: {e}")
15803    return video_ids
15804
15805
15806def upload_to_pinecone_audio(embeddings: Embeddings, metadata: List[AudioMetadata]) -> List[str]:
15807    audio_ids = [str(uuid.uuid4()) for _ in range(len(metadata))]
15808    try:
15809        PINECONE_AUDIO_INDEX.upsert(
15810            vectors=[
15811                {
15812                    "id": f"{audio_uuid}",
15813                    "values": embedding_aud.tolist(),
15814                    "metadata": {
15815                        "youtube_id": audio.video_id,
15816                    }
15817                }
15818                for audio_uuid, audio, embedding_aud
15819                in zip(audio_ids, metadata, embeddings.audio)
15820            ],
15821        )
15822    except Exception as e:
15823        print(f"Failed to upload to Pinecone: {e}")
15824    return audio_ids
15825
15826async def upload_video_metadata(
15827    metadata: List[VideoMetadata], 
15828    description_relevance_scores: List[float], 
15829    query_relevance_scores: List[float], 
15830    query: str, 
15831) -> List[str]:
15832    # generate embeddings from our metadata
15833    embeddings = Embeddings(
15834        video=torch.stack([torch.tensor(v.video_emb) for v in metadata]),
15835        audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]),
15836        description=torch.stack([torch.tensor(v.description_emb) for v in metadata]),
15837    )
15838    # upload embeddings and metadata to pinecone
15839    video_ids = await run_async(upload_to_pinecone, embeddings, metadata)
15840    # Schedule upload to HuggingFace
15841    video_dataset_uploader.add_videos(
15842        metadata,
15843        video_ids,
15844        description_relevance_scores,
15845        query_relevance_scores,
15846        query,
15847    )
15848    return video_ids
15849
15850async def upload_audio_metadata(
15851    metadata: List[AudioMetadata], 
15852    inverse_der: float, audio_length_score: float,
15853    audio_quality_total_score: float, 
15854    audio_query_score: float,
15855    query: str, 
15856    total_score: float 
15857) -> List[str]:
15858    embeddings = Embeddings(
15859        video=None,
15860        audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]),
15861        description=None,
15862    )
15863    audio_ids = await run_async(upload_to_pinecone_audio, embeddings, metadata)
15864    # Reuse the IDs returned above so the Hugging Face rows match the vectors already upserted to Pinecone.
15865    audio_dataset_uploader.add_audios(
15866        metadata,
15867        audio_ids,
15868        inverse_der,
15869        audio_length_score,
15870        audio_quality_total_score,
15871        audio_query_score,
15872        query,
15873        total_score
15874    )
15875    return audio_ids
15876
15877
15878def filter_embeddings(embeddings: Embeddings, is_too_similar: List[bool]) -> Embeddings:
15879    """Filter the embeddings based on whether they are too similar to existing videos."""
15880    is_too_similar = torch.tensor(is_too_similar)
15881    if embeddings.video is not None:
15882        embeddings.video = embeddings.video[~is_too_similar]
15883    if embeddings.audio is not None:
15884        embeddings.audio = embeddings.audio[~is_too_similar]
15885    if embeddings.description is not None:
15886        embeddings.description = embeddings.description[~is_too_similar]
15887    return embeddings
15888
15889
15890def filter_stuffed_embeddings(embeddings: Embeddings, stuffed: List[Tuple[bool, float]]) -> Embeddings:
15891    """Filter the embeddings based on whether their descriptions were flagged as stuffed."""
15892    stuffed = torch.tensor([s for s, _ in stuffed])
15893    if embeddings.video is not None:
15894        embeddings.video = embeddings.video[~stuffed]
15895    if embeddings.audio is not None:
15896        embeddings.audio = embeddings.audio[~stuffed]
15897    if embeddings.description is not None:
15898        embeddings.description = embeddings.description[~stuffed]
15899    return embeddings
15900
15901def is_similar(emb_1: torch.Tensor, emb_2: List[float]) -> bool:
15902    return bool(F.cosine_similarity(
15903        emb_1,
15904        torch.tensor(emb_2, device=emb_1.device).unsqueeze(0)
15905    ) > SIMILARITY_THRESHOLD)
15906
15907
15908def strict_is_similar(emb_1: torch.Tensor, emb_2: List[float]) -> bool:
15909    return torch.allclose(emb_1, torch.tensor(emb_2, device=emb_1.device), atol=1e-4)
15910
15911
15912def metadata_check(metadata: List[VideoMetadata]) -> List[VideoMetadata]:
15913    return [
15914        video_metadata for video_metadata in metadata
15915        if (
15916            video_metadata.end_time - video_metadata.start_time <= MAX_VIDEO_LENGTH and
15917            video_metadata.end_time - video_metadata.start_time >= MIN_VIDEO_LENGTH
15918        )
15919    ]
15920
15921
15922def audio_metadata_check(metadata: List[AudioMetadata]) -> List[AudioMetadata]:
15923    return [
15924        audio_metadata for audio_metadata in metadata
15925        if (
15926            audio_metadata.end_time - audio_metadata.start_time <= MAX_VIDEO_LENGTH and
15927            audio_metadata.end_time - audio_metadata.start_time >= MIN_VIDEO_LENGTH
15928        )
15929    ]
15930
15931def deduplicate_audios(embeddings: Embeddings) -> List[bool]:
15932    # return a list of booleans where True means the corresponding video is a duplicate i.e. is_similar
15933    audio_tensor = embeddings.audio
15934    num_audios = audio_tensor.shape[0]
15935    # cossim = CosineSimilarity(dim=1)
15936    is_similar = []
15937    for i in range(num_audios - 1):  # audio_tensor[i + 1:] would be empty for the last index
15938        similarity_score = F.cosine_similarity(audio_tensor[[i]], audio_tensor[i + 1:]).max()
15939        has_duplicates = (similarity_score > SIMILARITY_THRESHOLD).item()
15940        is_similar.append(has_duplicates)
15941    is_similar.append(False)  # last audio has no later items to duplicate
15942    return is_similar
15943
15944def compute_novelty_score_among_batch_audio(emb: Embeddings) -> List[float]:
15945    audio_tensor = emb.audio
15946    num_audios = audio_tensor.shape[0]
15947    novelty_scores = []
15948    for i in range(num_audios - 1):
15949        similarity_score = F.cosine_similarity(audio_tensor[[i]], audio_tensor[i + 1:]).max()
15950        novelty_scores.append(1 - similarity_score.item())
15951    novelty_scores.append(1.0)  # last audio is 100% novel
15952    return novelty_scores
15953
15954def get_proxy_url() -> Optional[str]:
15955    return random.choice(config.PROXY_LIST + [None])
15956
15957
15958async def get_random_video(metadata: List[VideoMetadata], check_video: bool) -> Optional[Tuple[VideoMetadata, Optional[BinaryIO]]]:
15959    if not check_video:
15960        random_metadata = random.choice(metadata)
15961        return random_metadata, None
15962
15963    random_video = None
15964    metadata_copy = [v for v in metadata]  # list shallow copy
15965    while random_video is None and len(metadata_copy) > 0:
15966        idx = random.randint(0, len(metadata_copy) - 1)
15967        random_metadata = metadata_copy.pop(idx)
15968        try:
15969            async with DOWNLOAD_SEMAPHORE:
15970                random_video = await asyncio.wait_for(run_async(
15971                    video_utils.download_youtube_video,
15972                    random_metadata.video_id,
15973                    random_metadata.start_time,
15974                    random_metadata.end_time,
15975                    proxy=get_proxy_url(),
15976                ), timeout=VIDEO_DOWNLOAD_TIMEOUT)
15977        except video_utils.IPBlockedException:
15978            # IP is blocked, cannot download video, check description only
15979            print("WARNING: IP is blocked, cannot download video, checking description only")
15980            return random_metadata, None
15981        except video_utils.FakeVideoException:
15982            print(f"WARNING: Video {random_metadata.video_id} is fake, punishing miner")
15983            return None
15984        except asyncio.TimeoutError:
15985            continue
15986
15987    # IP is not blocked, video is not fake, but video download failed for some reason. We don't
15988    # know why it failed so we won't punish the miner, but we will check the description only.
15989    if random_video is None:
15990        return random_metadata, None
15991
15992    return random_metadata, random_video
15993
15994
15995async def random_check(random_meta_and_vid: List[VideoMetadata], imagebind: ImageBind) -> bool:
15996    random_metadata, random_video = random_meta_and_vid
15997
15998    if random_video is None:
15999        desc_embeddings = await imagebind.embed_text_async([random_metadata.description])
16000        is_similar_ = is_similar(desc_embeddings, random_metadata.description_emb)
16001        strict_is_similar_ = strict_is_similar(desc_embeddings, random_metadata.description_emb)
16002        print(f"Description similarity: {is_similar_}, strict description similarity: {strict_is_similar_}")
16003        return is_similar_
16004
16005    # Video downloaded, check all embeddings
16006    embeddings = await imagebind.embed_async([random_metadata.description], [random_video])
16007    is_similar_ = (
16008        is_similar(embeddings.video, random_metadata.video_emb) and
16009        is_similar(embeddings.audio, random_metadata.audio_emb) and
16010        is_similar(embeddings.description, random_metadata.description_emb)
16011    )
16012    strict_is_similar_ = (
16013        strict_is_similar(embeddings.video, random_metadata.video_emb) and
16014        strict_is_similar(embeddings.audio, random_metadata.audio_emb) and
16015        strict_is_similar(embeddings.description, random_metadata.description_emb)
16016    )
16017    print(f"Total similarity: {is_similar_}, strict total similarity: {strict_is_similar_}")
16018    return is_similar_
16019
16020
16021async def get_num_unique_videos(videos: Videos) -> int:
16022    metadata = videos.video_metadata
16023    embeddings = Embeddings(
16024        video=torch.stack([torch.tensor(v.video_emb) for v in metadata]),
16025        audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]),
16026        description=torch.stack([torch.tensor(v.description_emb) for v in metadata]),
16027    )
16028    novelty_score, is_too_similar = await compute_novelty_score(embeddings)
16029    return sum([not is_sim for is_sim in is_too_similar])
16030
16031
16032async def _run_video_scoring(videos: Videos, imagebind: ImageBind, is_check_only: bool) -> dict:
16033    
16034    # check video_ids for fake videos
16035    if any(not video_utils.is_valid_youtube_id(video.video_id) for video in videos.video_metadata):
16036        return {"score": FAKE_VIDEO_PUNISHMENT}
16037    
16038    metadata = metadata_check(videos.video_metadata)[:videos.num_videos]
16039    print(f"Filtered {len(videos.video_metadata)} videos down to {len(metadata)} videos")
16040
16041    # return minimum score if no videos were found in video_metadata
16042    if len(metadata) == 0:
16043        return {"score": MIN_SCORE}
16044
16045    check_video = config.CHECK_PROBABILITY > random.random()
16046    random_meta_and_vid = await get_random_video(metadata, check_video)
16047    if random_meta_and_vid is None:
16048        return {"score": FAKE_VIDEO_PUNISHMENT}
16049
16050    async with GPU_SEMAPHORE:
16051        passed_check = await random_check(random_meta_and_vid, imagebind)
16052        if not passed_check:
16053            return {"score": FAKE_VIDEO_PUNISHMENT}
16054
16055        query_emb = await imagebind.embed_text_async([videos.query])
16056
16057    # Upload the videos to Pinecone and deduplicate
16058    original_length = len(metadata)
16059    embeddings = Embeddings(
16060        video=torch.stack([torch.tensor(v.video_emb) for v in metadata]).to(imagebind.device),
16061        audio=torch.stack([torch.tensor(v.audio_emb) for v in metadata]).to(imagebind.device),
16062        description=torch.stack([torch.tensor(v.description_emb) for v in metadata]).to(imagebind.device),
16063    )
16064    novelty_score, is_too_similar = await compute_novelty_score(embeddings)
16065    embeddings = filter_embeddings(embeddings, is_too_similar)
16066    metadata = [metadata for metadata, too_similar in zip(metadata, is_too_similar) if not too_similar]
16067    print(f"Deduplicated {original_length} videos down to {len(metadata)} videos")
16068
16069    # Filter out "stuffed" descriptions.
16070    pre_filter_metadata_length = len(metadata)
16071    stuffed = [
16072        unstuff.is_stuffed(meta.description)
16073        for meta in metadata
16074    ]
16075    if any([garbage and confidence > 0.75 for garbage, confidence in stuffed]):
16076        print("Stuffed description found with high confidence, penalizing the miner.")
16077        return {"score": STUFFED_DESCRIPTION_PUNISHMENT}
16078    
16079    # More stuffing.
16080    extraneous = [
16081        unstuff.check_extraneous_chunks(meta.description, meta.video_emb, meta.audio_emb, imagebind)
16082        for meta in metadata
16083    ]
16084    for really_bad, low_quality, total in extraneous:
16085        if really_bad > 5 or low_quality >= 16:
16086            print(f"Extraneous garbage found in text check {really_bad=} {low_quality=} {total=}")
16087            return {"score": STUFFED_DESCRIPTION_PUNISHMENT}
16088
16089    metadata = [
16090        metadata[idx]
16091        for idx in range(len(metadata))
16092        if not stuffed[idx][0]
16093        and extraneous[idx][1] <= 15
16094        and extraneous[idx][2] <= 50
16095    ]
16096    if len(metadata) < pre_filter_metadata_length:
16097        print(f"Filtering {pre_filter_metadata_length} videos down to {len(metadata)} videos to remove token-stuffed descriptions.")
16098    if len(metadata) == 0:
16099        return {"score": MIN_SCORE}
16100
16101    embeddings = filter_stuffed_embeddings(embeddings, stuffed)
16102
16103    # Compute relevance scores
16104    video_description_relevance_scores = F.cosine_similarity(
16105        embeddings.video, embeddings.description
16106    ).tolist()
16107    audio_description_relevance_scores = F.cosine_similarity(
16108        embeddings.audio, embeddings.description
16109    ).tolist()
16110    video_query_relevance_scores = F.cosine_similarity(
16111        embeddings.video, query_emb
16112    ).tolist()
16113    audio_query_relevance_scores = F.cosine_similarity(
16114        embeddings.audio, query_emb
16115    ).tolist()
16116
16117    # Query relevance score now includes video cosim, audio cosim, and text cosim using higher quality text-only model.
16118    query_relevance_scores = [
16119        sum([
16120            video_query_relevance_scores[idx],
16121            audio_query_relevance_scores[idx],
16122            get_text_similarity_score(metadata[idx].description, videos.query),
16123        ]) / 3
16124        for idx in range(len(video_query_relevance_scores))
16125    ]
16126
16127    # Combine audio & visual description scores, weighted towards visual.
16128    description_relevance_scores = [
16129        sum([
16130            video_description_relevance_scores[idx] * VIDEO_RELEVANCE_WEIGHT,
16131            audio_description_relevance_scores[idx] * (1.0 - VIDEO_RELEVANCE_WEIGHT),
16132        ])
16133        for idx in range(len(video_description_relevance_scores))
16134    ]
16135
16136    # Scale description scores by number of unique tokens.
16137    length_scalers = []
16138    for idx in range(len(description_relevance_scores)):
16139        unique_tokens = LENGTH_TOKENIZER(metadata[idx].description)
16140        unique_tokens = set(unique_tokens[unique_tokens != 0][1:-1].tolist())
16141        unique_token_count = len(unique_tokens)
16142        if unique_token_count <= MIN_LENGTH_BOOST_TOKEN_COUNT:
16143            print(f"Very few tokens, applying {DESCRIPTION_LENGTH_WEIGHT} penalty.")
16144            description_relevance_scores[idx] *= (1.0 - DESCRIPTION_LENGTH_WEIGHT)
16145            length_scalers.append(0)
16146            continue
16147        length_scaler = min(math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2), math.log(unique_token_count, 2)) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2)
16148        length_scaler /= (math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2) - math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2))
16149        length_scalers.append(length_scaler)
16150        print(f"Description length scaling factor = {length_scaler}")
16151        description_relevance_scores[idx] -= description_relevance_scores[idx] * DESCRIPTION_LENGTH_WEIGHT * (1.0 - length_scaler)
16152
16153    # Aggregate scores
16154    score = (
16155        (sum(description_relevance_scores) * DESCRIPTION_RELEVANCE_SCALING_FACTOR) +
16156        (sum(query_relevance_scores) * QUERY_RELEVANCE_SCALING_FACTOR)
16157    ) / 2 / videos.num_videos
16158
16159    print(f'''
16160        is_unique: {[not is_sim for is_sim in is_too_similar]},
16161        video cosine sim: {video_description_relevance_scores},
16162        audio cosine sim: {audio_description_relevance_scores},
16163        description relevance scores: {description_relevance_scores},
16164        query relevance scores: {query_relevance_scores},
16165        length scalers: {length_scalers},
16166        total score: {score}
16167    ''')
16168
16169    if not is_check_only and len(metadata) > 0:
16170        video_ids = await run_async(upload_to_pinecone, embeddings, metadata)
16171        # Schedule upload to HuggingFace
16172        video_dataset_uploader.add_videos(
16173            metadata,
16174            video_ids,
16175            description_relevance_scores,
16176            query_relevance_scores,
16177            videos.query,
16178        )
16179    score = max(score, MIN_SCORE)
16180
16181    if score > 0.4:
16182        print(f"Videos with score > 0.4: {metadata}")
16183
16184    return {
16185        "is_unique": [not is_sim for is_sim in is_too_similar],
16186        "description_relevance_scores": description_relevance_scores,
16187        "query_relevance_scores": query_relevance_scores,
16188        "score": score,
16189    }
16190
16191
16192async def _run_audio_scoring(audios: Audios, imagebind: ImageBind, is_check_only: bool = False) -> float | dict:
16193    """Score audio submissions and optionally upload them.
16194
16195    Args:
16196        audios: The audio submissions to score
16197        imagebind: ImageBind model for embeddings
16198        is_check_only: If True, only score without uploading
16199
16200    Returns:
16201        A float (MIN_SCORE or FAKE_VIDEO_PUNISHMENT) on early exit, otherwise a dict with detailed scoring info
16202    """
16203    if len(audios.audio_metadata) == 0:
16204        return MIN_SCORE
16205
16206    # Check for valid YouTube IDs
16207    if any(not video_utils.is_valid_youtube_id(audio.video_id) for audio in audios.audio_metadata):
16208        return FAKE_VIDEO_PUNISHMENT
16209    
16210
16211    # Check audio metadata and filter out invalid ones
16212    metadata = audio_metadata_check(audios.audio_metadata)[:audios.num_audios]
16213    print(f"Filtered {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
16214    
16215
16216    # Embed the query text (GPU-bound, so gate on the semaphore)
16217    async with GPU_SEMAPHORE:
16218        query_emb = await imagebind.embed_text_async([audios.query])
16219    
16220    embeddings = Embeddings(
16221        video=None,
16222        audio=torch.stack([torch.tensor(a.audio_emb) for a in metadata]).to(imagebind.device),
16223        description=None
16224    )
16225
16226    # Check and deduplicate audios based on embedding similarity. We do this because we're not uploading to pinecone first.
16227    metadata_is_similar = await deduplicate_audios(embeddings)
16228    metadata = [meta for meta, too_similar in zip(metadata, metadata_is_similar) if not too_similar]
16229    embeddings = filter_embeddings(embeddings, metadata_is_similar)
16230    
16231    if len(metadata) < len(audios.audio_metadata):
16232        print(f"Deduplicated {len(audios.audio_metadata)} audios down to {len(metadata)} audios")
16233    
16234    if len(metadata) == 0:
16235        return MIN_SCORE
16236        
16237    # first get local novelty scores
16238    local_novelty_scores = compute_novelty_score_among_batch_audio(embeddings)
16239    pre_filter_metadata_length = len(metadata)
16240    # check scores from index for being too similar
16241    is_too_similar = [score < DIFFERENCE_THRESHOLD for score in local_novelty_scores]
16242    # filter out metadata too similar
16243    metadata = [meta for meta, too_similar in zip(metadata, is_too_similar) if not too_similar]
16244    # filter out embeddings too similar
16245    embeddings = filter_embeddings(embeddings, is_too_similar)
16246    if len(metadata) < pre_filter_metadata_length:
16247        print(f"Filtering {pre_filter_metadata_length} audios down to {len(metadata)} audios that are too similar to audios in our index.")
16248
16249    # return minimum score if no unique audios were found
16250    if len(metadata) == 0:
16251        return MIN_SCORE
16252
16253    # Filter the already-deduplicated metadata based on length constraints
16254    # (re-filtering audios.audio_metadata here would undo the dedup above)
16255    metadata = [
16256        meta for meta in metadata
16257        if MIN_AUDIO_LENGTH_SECONDS <= (meta.end_time - meta.start_time) <= MAX_AUDIO_LENGTH_SECONDS
16258    ]
16259
16260    if len(metadata) == 0:
16261        return MIN_SCORE
16262    
16263    total_audio_length = sum((meta.end_time - meta.start_time) for meta in metadata) 
16264    print(f"Average audio length: {total_audio_length/len(metadata):.2f} seconds")
16265    audio_length_score = total_audio_length/(audios.num_audios*MAX_AUDIO_LENGTH_SECONDS)
16266
16267    audio_query_score = sum(F.cosine_similarity(
16268        embeddings.audio, query_emb
16269    ).tolist())/len(metadata)
16270    print(f"Audio query score: {audio_query_score}")
16271
16272    # Randomly sample one audio for duration check
16273    selected_random_meta = random.choice(metadata)
16274    audio_array, sr = sf.read(BytesIO(selected_random_meta.audio_bytes))
16275    audio_duration = len(audio_array) / sr
16276    print(f"Selected Youtube Video: {selected_random_meta.video_id}, Duration: {audio_duration:.2f} seconds")
16277
16278    audio_quality_scores = AudioScore().total_score(
16279        audio_array,
16280        sr,
16281        selected_random_meta.diar_timestamps_start,
16282        selected_random_meta.diar_timestamps_end,
16283        selected_random_meta.diar_speakers
16284    )
16285    audio_quality_total_score = (
16286        audio_quality_scores["speech_content_score"] * SPEECH_CONTENT_SCALING_FACTOR +
16287        audio_quality_scores["speaker_dominance_score"] * SPEAKER_DOMINANCE_SCALING_FACTOR +
16288        audio_quality_scores["background_noise_score"] * BACKGROUND_NOISE_SCALING_FACTOR
16289    )
16290
16291    miner_diar_segment = {
16292        "start": selected_random_meta.diar_timestamps_start,
16293        "end": selected_random_meta.diar_timestamps_end,
16294        "speakers": selected_random_meta.diar_speakers,
16295    }
16296
16297    diarization_score = calculate_diarization_metrics(
16298        audio_array,
16299        sr,
16300        miner_diar_segment
16301    )
16302    inverse_der = diarization_score["inverse_der"]
16303    total_score = (
16304        DIARIZATION_SCALING_FACTOR * inverse_der +
16305        AUDIO_LENGTH_SCALING_FACTOR * audio_length_score +
16306        AUDIO_QUALITY_SCALING_FACTOR * audio_quality_total_score +
16307        AUDIO_QUERY_RELEVANCE_SCALING_FACTOR * audio_query_score
16308    )
16309
16310    print(f'''
16311        is_unique: {[not is_sim for is_sim in is_too_similar]},
16312        audio_query_score: {audio_query_score},
16313        audio_length_score: {audio_length_score}, 
16314        audio_quality_score: {audio_quality_total_score},
16315        diarization_score: {inverse_der},
16316        total score: {total_score}
16317    ''')
16318    
16319    if not is_check_only and len(metadata) > 0:
16320        # Upload metadata and schedule dataset upload
16321        audio_ids = await run_async(upload_to_pinecone_audio, embeddings, metadata)
16322
16323        audio_dataset_uploader.add_audios(
16324            metadata,
16325            audio_ids,
16326            inverse_der,
16327            audio_length_score,
16328            audio_quality_total_score,
16329            audio_query_score,
16330            audios.query,
16331            total_score,
16332        )
16333    total_score = max(total_score, MIN_SCORE)
16334
16335    if total_score > 0.4:
16336        print(f"Audios with score > 0.4: {metadata}")
16337
16338    return {
16339        "is_unique": [not is_sim for is_sim in is_too_similar],
16340        "audio_query_score": audio_query_score,
16341        "audio_length_score": audio_length_score,
16342        "audio_quality_score": audio_quality_total_score,
16343        "diarization_score": inverse_der,
16344        "score": total_score
16345    }
16346
16347
16348async def score_videos_for_testing(videos: Videos, imagebind: ImageBind) -> float | dict:
16349    return await _run_video_scoring(videos, imagebind, is_check_only=True)
16350
16351
16352async def score_and_upload_videos(videos: Videos, imagebind: ImageBind) -> float:
16353    scores_dict = await _run_video_scoring(videos, imagebind, is_check_only=False)
16354    return scores_dict["score"]
16355
16356
16357async def score_audios_for_testing(audios: Audios, imagebind: ImageBind) -> float | dict:
16358    return await _run_audio_scoring(audios, imagebind, is_check_only=True)
16359
16360
16361async def score_and_upload_audios(audios: Audios, imagebind: ImageBind) -> float:
16362    scores_dict = await _run_audio_scoring(audios, imagebind, is_check_only=False)
16363    return scores_dict["score"]
16364
16365
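The description length boost in `_run_video_scoring` interpolates the log2 of the unique-token count between a lower and upper bound. A standalone sketch of that interpolation (the bound values here are illustrative, not the validator's real constants):

```python
import math

# Illustrative bounds; the actual values come from the validator's constants.
MIN_LENGTH_BOOST_TOKEN_COUNT = 32
MAX_LENGTH_BOOST_TOKEN_COUNT = 256

def length_scaler(unique_token_count: int) -> float:
    """Map a unique-token count to [0, 1] by log2 interpolation between the bounds."""
    lo = math.log(MIN_LENGTH_BOOST_TOKEN_COUNT, 2)
    hi = math.log(MAX_LENGTH_BOOST_TOKEN_COUNT, 2)
    # Clamp at the upper bound, then normalize against the log2 span.
    scaled = min(hi, math.log(unique_token_count, 2)) - lo
    return scaled / (hi - lo)
```

With these bounds, 32 tokens maps to 0.0, 256 or more to 1.0, and 64 tokens to (6 - 5) / (8 - 5) = 1/3; the caller then shrinks the description score by `DESCRIPTION_LENGTH_WEIGHT * (1 - length_scaler)`.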
16366---
16367File: /validator-api/_generate_api_key.py
16368---
16369
16370import secrets
16371
16372def generate_api_key():
16373    return secrets.token_urlsafe(32)  # Generates a 32-byte (256-bit) key
16374
16375new_api_key = generate_api_key()
16376print(new_api_key)
16377
16378
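For reference, `secrets.token_urlsafe(32)` base64url-encodes 32 random bytes: 44 base64 characters minus one stripped `=` padding character, so each generated key is a 43-character URL-safe string:

```python
import secrets

key = secrets.token_urlsafe(32)
# 32 bytes -> ceil(32 / 3) * 4 = 44 base64 chars, minus 1 padding char = 43
assert len(key) == 43
```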
16379---
16380File: /validator-api/app.py
16381---
16382
16383import asyncio
16384import requests
16385import os
16386import json
16387from datetime import datetime
16388import time
16389from typing import Annotated, List, Optional, Dict, Any
16390import random
16392from pydantic import BaseModel
16393import traceback
16394from threading import Lock
16395
16396from tempfile import TemporaryDirectory
16397import huggingface_hub
16398from datasets import load_dataset
16399import ulid
16400
16401from traceback import print_exception
16402
16403import bittensor
16404import uvicorn
16405from fastapi import FastAPI, HTTPException, Depends, Body, Path, Security, BackgroundTasks, Request
16406from fastapi.security import HTTPBasicCredentials, HTTPBasic
16407from fastapi.security.api_key import APIKeyHeader
16408from fastapi.staticfiles import StaticFiles
16409from fastapi.responses import FileResponse
16410from starlette import status
16411from substrateinterface import Keypair
16412
16413import sentry_sdk
16414
16415from sqlalchemy.orm import Session
16416from validator_api.database import get_db, get_db_context
16417from validator_api.database.crud.focusvideo import (
16418    get_all_available_focus, check_availability, get_video_owner_coldkey,
16419    already_purchased_max_focus_tao, get_miner_purchase_stats, MinerPurchaseStats,
16420    set_focus_video_score, mark_video_rejected, mark_video_submitted, TaskType
16421)
16422from validator_api.utils.marketplace import get_max_focus_tao, TASK_TYPE_MAP, get_purchase_max_focus_tao
16423from validator_api.cron.confirm_purchase import confirm_transfer, confirm_video_purchased
16424from validator_api.services.scoring_service import FocusScoringService, VideoUniquenessError
16425
16426from validator_api.communex.client import CommuneClient
16427from validator_api.communex._common import get_node_url
16428
16429from omega.protocol import Videos, VideoMetadata, AudioMetadata
16430from validator_api.imagebind_loader import ImageBindLoader
16431
16432from validator_api.config import (
16433    NETWORK, NETUID, 
16434    ENABLE_COMMUNE, COMMUNE_NETWORK, COMMUNE_NETUID,
16435    API_KEY_NAME, API_KEYS, DB_CONFIG,
16436    TOPICS_LIST, PROXY_LIST, IS_PROD, 
16437    FOCUS_REWARDS_PERCENT, FOCUS_API_KEYS,
16438    SENTRY_DSN, IMPORT_SCORE
16439)
16440
16441print("IMPORT_SCORE:", IMPORT_SCORE)
16442
16443if IMPORT_SCORE is not False:
16444    from validator_api import score
16445else:
16446    # skip importing score to avoid CUDA errors on Mac
16447    score = None
16448
16449from validator_api.dataset_upload import video_dataset_uploader, audio_dataset_uploader
16450from validator_api.limiter import limiter
16451
16452
16453### Constants for OMEGA Metadata Dashboard ###
16454HF_DATASET = "omegalabsinc/omega-multimodal"
16455DATA_FILES_PREFIX = "default/train/"
16456MAX_FILES = 1
16457CACHE_FILE = "desc_embeddings_recent.json"
16458MIN_AGE = 60 * 60 * 48  # 2 days in seconds
16459
16460import mysql.connector
16461
16462def connect_to_db():
16463    try:
16464        return mysql.connector.connect(**DB_CONFIG)
16465    except mysql.connector.Error as err:
16466        print("Error in connect_to_db while creating MySQL database connection:", err)
16467        return None  # callers must handle a failed (None) connection
16468
16469# define the APIKeyHeader for API authorization to our multi-modal endpoints
16470api_key_header = APIKeyHeader(name=API_KEY_NAME, auto_error=False)
16471focus_api_key_header = APIKeyHeader(name="FOCUS_API_KEY", auto_error=False)
16472
16473security = HTTPBasic()
16474imagebind_loader = ImageBindLoader()
16475
16476focus_scoring_service = FocusScoringService()
16477
16478print("SENTRY_DSN:", SENTRY_DSN)
16479sentry_sdk.init(
16480    dsn=SENTRY_DSN,
16481    traces_sample_rate=1.0,
16482    profiles_sample_rate=1.0,
16483)
16484
16485# region Utility functions for OMEGA Metadata Dashboard
16486def get_timestamp_from_filename(filename: str):
16487    return ulid.from_str(os.path.splitext(filename.split("/")[-1])[0]).timestamp().timestamp
16488
16489def pull_and_cache_dataset() -> bool:
16490    # Get the list of files in the dataset repository
16491    omega_ds_files = huggingface_hub.repo_info(repo_id=HF_DATASET, repo_type="dataset").siblings
16492    
16493    # Filter files that match the DATA_FILES_PREFIX
16494    recent_files = [
16495        f.rfilename
16496        for f in omega_ds_files if
16497        f.rfilename.startswith(DATA_FILES_PREFIX) and 
16498        time.time() - get_timestamp_from_filename(f.rfilename) < MIN_AGE
16499    ]
16500    
16501    # Randomly sample up to MAX_FILES from the matching files
16502    sampled_files = random.sample(recent_files, min(MAX_FILES, len(recent_files)))
16503    
16504    # Load the dataset using the sampled files
16505    video_metadata = []
16506    with TemporaryDirectory() as temp_dir:
16507        omega_dataset = load_dataset(HF_DATASET, data_files=sampled_files, cache_dir=temp_dir)["train"]
16508        for i, entry in enumerate(omega_dataset):
16509            metadata = []
16510            if "description" in entry and "description_embed" in entry:
16511                metadata.append(entry["video_id"])
16512                metadata.append(entry["youtube_id"])
16513                metadata.append(entry["start_time"])
16514                metadata.append(entry["end_time"])
16515                metadata.append(entry["description"])
16516                metadata.append(entry["description_relevance_score"])
16517                metadata.append(entry["query_relevance_score"])
16518                metadata.append(entry["query"])
16519                metadata.append(entry["submitted_at"])
16520                video_metadata.append(metadata)
16521    
16522    # Cache the descriptions to a local file
16523    with open(CACHE_FILE, "w") as f:
16524        json.dump(video_metadata, f)
16525    
16526    return True
16527# endregion Utility functions for OMEGA Metadata Dashboard
16528
16529async def get_api_key(api_key_header: str = Security(api_key_header)):
16530    if api_key_header in API_KEYS:
16531        return api_key_header
16532    else:
16533        raise HTTPException(
16534            status_code=401,
16535            detail="Invalid API Key"
16536        )
16537    
16538async def get_focus_api_key(focus_api_key_header: str = Security(focus_api_key_header)):
16539    if focus_api_key_header in FOCUS_API_KEYS:
16540        return focus_api_key_header
16541    else:
16542        raise HTTPException(
16543            status_code=401,
16544            detail="Invalid API Key"
16545        )
16546
16547class VideoMetadataUpload(BaseModel):
16548    metadata: List[VideoMetadata]
16549    description_relevance_scores: List[float]
16550    query_relevance_scores: List[float]
16551    topic_query: str
16552    novelty_score: Optional[float] = None
16553    total_score: Optional[float] = None
16554    miner_hotkey: Optional[str] = None
16555
16556class AudioMetadataUpload(BaseModel):
16557    metadata: List[AudioMetadata]
16558    inverse_der: float
16559    audio_length_score: float
16560    audio_quality_total_score: float
16561    audio_query_score: float
16562    topic_query: str
16563    total_score: Optional[float] = None
16564    miner_hotkey: Optional[str] = None
16565
16566class FocusScoreResponse(BaseModel):
16567    video_id: str
16568    video_score: float
16569    video_details: dict
16570
16571class VideoPurchaseRevert(BaseModel):
16572    video_id: str
16573
16574def get_hotkey(credentials: Annotated[HTTPBasicCredentials, Depends(security)]) -> str:
16575    keypair = Keypair(ss58_address=credentials.username)
16576
16577    if keypair.verify(credentials.username, credentials.password):
16578        return credentials.username
16579
16580    raise HTTPException(
16581        status_code=status.HTTP_401_UNAUTHORIZED,
16582        detail="Signature mismatch",
16583    )
16584
16585def check_commune_validator_hotkey(hotkey: str, modules_keys):
16586    if hotkey not in modules_keys.values():
16587        print("Commune validator key not found")
16588        return False
16589    return True
16590
16591def authenticate_with_bittensor(hotkey, metagraph):
16592    if hotkey not in metagraph.hotkeys:
16593        return False
16594
16595    uid = metagraph.hotkeys.index(hotkey)
16596    if not metagraph.validator_permit[uid] and NETWORK != "test":
16597        print("Bittensor validator permit required")
16598        return False
16599    
16600    if metagraph.S[uid] < 1000 and NETWORK != "test":
16601        print("Bittensor validator requires 1000+ staked TAO")
16602        return False
16603    
16604    return True
16605
16606def authenticate_with_commune(hotkey, commune_keys):
16607    if ENABLE_COMMUNE and not check_commune_validator_hotkey(hotkey, commune_keys):
16608        return False
16609    return True
16610
16611def update_commune_keys(commune_client, commune_keys):
16612    try:
16613        return commune_client.query_map_key(COMMUNE_NETUID)
16614    except Exception as err:
16615        print("Error during commune keys update", str(err))
16616        return commune_keys
16617
16618async def run_focus_scoring(
16619    video_id: Annotated[str, Body()],
16620    focusing_task: Annotated[str, Body()],
16621    focusing_description: Annotated[str, Body()]
16622) -> Dict[str, Any]:
16623
16624    score_details = None
16625    embeddings = None
16626    try:
16627        score_details, embeddings = await focus_scoring_service.score_video(video_id, focusing_task, focusing_description)
16628        print(f"Score for focus video <{video_id}>: {score_details.final_score}")
16629        MIN_FINAL_SCORE = 0.1
16630        # todo: measure and tune these
16631        MIN_TASK_UNIQUENESS_SCORE = 0
16632        MIN_VIDEO_UNIQUENESS_SCORE = 0
16633        # get the db after scoring the video so it's not open for too long
16634        with get_db_context() as db:
16635            if score_details.final_score < MIN_FINAL_SCORE:
16636                rejection_reason = f"""This video got a score of {score_details.final_score * 100:.2f}%, which is lower than the minimum score of {MIN_FINAL_SCORE * 100}%.
16637Feedback from AI: {score_details.completion_score_breakdown.rationale}"""
16638                mark_video_rejected(
16639                    db,
16640                    video_id,
16641                    rejection_reason,
16642                    score_details=score_details,
16643                    embeddings=embeddings
16644                )
16645            else:
16646                set_focus_video_score(db, video_id, score_details, embeddings)
16647        return { "success": True }
16648
16649    except Exception as e:
16650        exception_string = traceback.format_exc()
16651        error_string = f"{str(e)}\n{exception_string}"
16652        print(f"Error scoring focus video <{video_id}>: {error_string}")
16653        with get_db_context() as db:
16654            mark_video_rejected(
16655                db,
16656                video_id,
16657                "Task recording is not unique. If you believe this is an error, please contact a team member." if isinstance(e, VideoUniquenessError) else "Error scoring video",
16658                score_details=score_details,
16659                embeddings=embeddings,
16660                exception_string=exception_string,
16661            )
16662        return { "success": False, "error": error_string }
16663
16664async def main():
16665    app = FastAPI()
16666    # Mount the static directory to serve static files
16667    app.mount("/static", StaticFiles(directory="validator-api/static"), name="static")
16668
16669    subtensor = bittensor.subtensor(network=NETWORK)
16670    metagraph: bittensor.metagraph = subtensor.metagraph(NETUID)
16671
16672    commune_client = None
16673    commune_keys = None
16674    if ENABLE_COMMUNE:
16675        commune_client = CommuneClient(get_node_url(use_testnet=True if COMMUNE_NETWORK == "test" else False))
16676        commune_keys = update_commune_keys(commune_client, commune_keys)
16677
16678    async def resync_metagraph():
16679        """Resyncs the metagraph and refreshes the commune keys on a fixed interval."""
16680        nonlocal commune_keys  # rebind the enclosing scope's keys rather than shadowing them
16681        while True:
16682
16683            try:
16684                # Sync the metagraph.
16685                print("syncing metagraph")
16686                metagraph.sync(subtensor=subtensor)
16687                print("metagraph synced")
16688
16689                # Sync latest commune keys
16690                if ENABLE_COMMUNE:
16691                    commune_keys = update_commune_keys(commune_client, commune_keys)
16692                    print("commune keys synced")
16693            
16694            # In case of unforeseen errors, the api will log the error and continue operations.
16695            except Exception as err:
16696                print("Error during metagraph sync", str(err))
16697                print_exception(type(err), err, err.__traceback__)
16698
16699            await asyncio.sleep(90)
16700    
16701    @app.on_event("shutdown")
16702    async def shutdown_event():
16703        print("Shutdown event fired, attempting dataset upload of current batch.")
16704        video_dataset_uploader.submit()
16705        audio_dataset_uploader.submit()
16706
16707    @app.get("/sentry-debug")
16708    async def trigger_error():
16709        division_by_zero = 1 / 0
16710
16711    @app.post("/api/get_pinecone_novelty")
16712    async def get_pinecone_novelty(
16713        metadata: List[VideoMetadata],
16714        hotkey: Annotated[str, Depends(get_hotkey)],
16715    ) -> List[float]:
16716        print("get_pinecone_novelty()")
16717        
16718        if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16719            raise HTTPException(
16720                status_code=status.HTTP_403_FORBIDDEN,
16721                detail="Valid hotkey required.",
16722            )
16723        
16724        uid = None
16725        if ENABLE_COMMUNE and hotkey in commune_keys.values():
16726            # get uid of commune validator
16727            for key_uid, key_hotkey in commune_keys.items():
16728                if key_hotkey == hotkey:
16729                    uid = key_uid
16730                    break
16731            validator_chain = "commune"
16732        elif uid is None and hotkey in metagraph.hotkeys:
16733            # get uid of bittensor validator
16734            uid = metagraph.hotkeys.index(hotkey)
16735            validator_chain = "bittensor"
16736        
16737        start_time = time.time()
16738        # query the pinecone index to get novelty scores
16739        novelty_scores = await score.get_pinecone_novelty(metadata)
16740        print(f"Returning novelty scores={novelty_scores} for {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16741        return novelty_scores
16742
16743    @app.post("/api/upload_video_metadata")
16744    async def upload_video_metadata(
16745        upload_data: VideoMetadataUpload,
16746        hotkey: Annotated[str, Depends(get_hotkey)],
16747    ) -> bool:
16748        print("upload_video_metadata()")
16749        if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16750            raise HTTPException(
16751                status_code=status.HTTP_403_FORBIDDEN,
16752                detail="Valid hotkey required.",
16753            )
16754        
16755        uid = None
16756        is_bittensor = 0
16757        is_commune = 0
16758        if ENABLE_COMMUNE and hotkey in commune_keys.values():
16759            # get uid of commune validator
16760            for key_uid, key_hotkey in commune_keys.items():
16761                if key_hotkey == hotkey:
16762                    uid = key_uid
16763                    break
16764            validator_chain = "commune"
16765            is_commune = 1
16766        elif uid is None and hotkey in metagraph.hotkeys:
16767            # get uid of bittensor validator
16768            uid = metagraph.hotkeys.index(hotkey)
16769            validator_chain = "bittensor"
16770            is_bittensor = 1
16771
16772        metadata = upload_data.metadata
16773        description_relevance_scores = upload_data.description_relevance_scores
16774        query_relevance_scores = upload_data.query_relevance_scores
16775        topic_query = upload_data.topic_query
16776
16777        start_time = time.time()
16778        video_ids = await score.upload_video_metadata(metadata, description_relevance_scores, query_relevance_scores, topic_query)
16779        print(f"Uploaded {len(video_ids)} video metadata from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16780        
16781        if upload_data.miner_hotkey is not None:
16782            # Calculate and upsert leaderboard data
16783            datapoints = len(video_ids)
16784            avg_desc_relevance = sum(description_relevance_scores) / len(description_relevance_scores)
16785            avg_query_relevance = sum(query_relevance_scores) / len(query_relevance_scores)
16786            novelty_score = upload_data.novelty_score
16787            total_score = upload_data.total_score
16788            miner_hotkey = upload_data.miner_hotkey
16789            
16790            try:
16791                start_time = time.time()
16792                connection = connect_to_db()
16793
16794                leaderboard_table_name = "miner_leaderboard"
16795                if not IS_PROD:
16796                    leaderboard_table_name += "_test"
16797                query = f"""
16798                INSERT INTO {leaderboard_table_name} (
16799                    hotkey,
16800                    is_bittensor,
16801                    is_commune,
16802                    datapoints,
16803                    avg_desc_relevance,
16804                    avg_query_relevance,
16805                    avg_novelty,
16806                    avg_score,
16807                    last_updated
16808                ) VALUES (
16809                    %s, %s, %s, %s, %s, %s, %s, %s, NOW()
16810                ) ON DUPLICATE KEY UPDATE
16811                    datapoints = datapoints + VALUES(datapoints),
16812                    avg_desc_relevance = ((avg_desc_relevance * (datapoints - VALUES(datapoints))) + (VALUES(avg_desc_relevance) * VALUES(datapoints))) / datapoints,
16813                    avg_query_relevance = ((avg_query_relevance * (datapoints - VALUES(datapoints))) + (VALUES(avg_query_relevance) * VALUES(datapoints))) / datapoints,
16814                    avg_novelty = ((avg_novelty * (datapoints - VALUES(datapoints))) + (VALUES(avg_novelty) * VALUES(datapoints))) / datapoints,
16815                    avg_score = ((avg_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_score) * VALUES(datapoints))) / datapoints,
16816                    last_updated = NOW();
16817                """
16818                cursor = connection.cursor()
16819                cursor.execute(query, (
16820                    miner_hotkey,
16821                    is_bittensor,
16822                    is_commune,
16823                    datapoints,
16824                    avg_desc_relevance,
16825                    avg_query_relevance,
16826                    novelty_score,
16827                    total_score
16828                ))
16829                connection.commit()
16830                print(f"Upserted leaderboard data for {miner_hotkey} from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16831                
16832            except mysql.connector.Error as err:
16833                raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
16834            finally:
16835                if connection:
16836                    connection.close()
16837        else:
16838            print("Skipping leaderboard update: non-production environment or validator running outdated code.")
16839        
16840        return True
16841
16842
16843    @app.post("/api/upload_audio_metadata")
16844    async def upload_audio_metadata(
16845        upload_data: AudioMetadataUpload,
16846        hotkey: Annotated[str, Depends(get_hotkey)],
16847    ) -> bool:
16848        print("upload_audio_metadata()")
16849        
16850        if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16851            raise HTTPException(
16852                status_code=status.HTTP_403_FORBIDDEN,
16853                detail="Valid hotkey required.",
16854            )
16855        
16856        uid = None
16857        is_bittensor = 0
16858        is_commune = 0
16859        if ENABLE_COMMUNE and hotkey in commune_keys.values():
16860            # get uid of commune validator
16861            for key_uid, key_hotkey in commune_keys.items():
16862                if key_hotkey == hotkey:
16863                    uid = key_uid
16864                    break
16865            validator_chain = "commune"
16866            is_commune = 1
16867        elif uid is None and hotkey in metagraph.hotkeys:
16868            # get uid of bittensor validator
16869            uid = metagraph.hotkeys.index(hotkey)
16870            validator_chain = "bittensor"
16871            is_bittensor = 1
16872
16873        metadata = upload_data.metadata
16874        inverse_der = upload_data.inverse_der
16875        audio_length_score = upload_data.audio_length_score
16876        audio_quality_total_score = upload_data.audio_quality_total_score
16877        audio_query_score = upload_data.audio_query_score
16878        topic_query = upload_data.topic_query
16879        total_score = upload_data.total_score
16880
16881        start_time = time.time()
16882        audio_ids = await score.upload_audio_metadata(metadata, inverse_der, audio_length_score, audio_quality_total_score, audio_query_score, topic_query, total_score)
16883        print(f"Uploaded {len(audio_ids)} audio metadata from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16884        
16885        if upload_data.miner_hotkey is not None:
16886            # Calculate and upsert leaderboard data
16887            datapoints = len(audio_ids)
16888            total_score = upload_data.total_score
16889            miner_hotkey = upload_data.miner_hotkey
16890            
16891            try:
16892                start_time = time.time()
16893                connection = connect_to_db()
16894
16895                leaderboard_table_name = "miner_leaderboard_audio"
16896                if not IS_PROD:
16897                    leaderboard_table_name += "_test"
16898                query = f"""
16899                INSERT INTO {leaderboard_table_name} (
16900                    hotkey,
16901                    is_bittensor,
16902                    is_commune,
16903                    datapoints,
16904                    avg_der,
16905                    avg_length_score,
16906                    avg_quality_score,
16907                    avg_query_score,
16908                    avg_score,
16909                    last_updated
16910                ) VALUES (
16911                    %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW()
16912                ) ON DUPLICATE KEY UPDATE
16913                    datapoints = datapoints + VALUES(datapoints),
16914                    avg_der = ((avg_der * (datapoints - VALUES(datapoints))) + (VALUES(avg_der) * VALUES(datapoints))) / datapoints,
16915                    avg_length_score = ((avg_length_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_length_score) * VALUES(datapoints))) / datapoints,
16916                    avg_quality_score = ((avg_quality_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_quality_score) * VALUES(datapoints))) / datapoints,
16917                    avg_query_score = ((avg_query_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_query_score) * VALUES(datapoints))) / datapoints,
16918                    avg_score = ((avg_score * (datapoints - VALUES(datapoints))) + (VALUES(avg_score) * VALUES(datapoints))) / datapoints,
16919                    last_updated = NOW();
16920                """
16921                cursor = connection.cursor()
16922                cursor.execute(query, (
16923                    miner_hotkey,
16924                    is_bittensor,
16925                    is_commune,
16926                    datapoints,
16927                    inverse_der,
16928                    audio_length_score,
16929                    audio_quality_total_score,
16930                    audio_query_score,
16931                    total_score
16932                ))
16933                connection.commit()
16934                print(f"Upserted leaderboard data for {miner_hotkey} from {validator_chain} validator={uid} in {time.time() - start_time:.2f}s")
16935                
16936            except mysql.connector.Error as err:
16937                raise HTTPException(status_code=500, detail=f"Error upserting leaderboard data to MySQL database: {err}")
16938            finally:
16939                if connection:
16940                    connection.close()
16941        else:
16942            print("Skipping leaderboard update because this is either a non-production environment or the validator is running outdated code.")
16943        
16944        return True
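The `ON DUPLICATE KEY UPDATE` clause above maintains running averages by re-weighting the stored value against the incoming batch: `datapoints` is updated first, so `datapoints - VALUES(datapoints)` is the old count and the division is by the new total. A minimal Python sketch of that arithmetic (the `merge_average` helper is hypothetical, for illustration only):

```python
def merge_average(old_avg: float, old_n: int, batch_avg: float, batch_n: int) -> float:
    """Combine a stored average over old_n datapoints with a batch average,
    mirroring the SQL: ((avg * old_n) + (batch_avg * batch_n)) / new_n."""
    new_n = old_n + batch_n
    return (old_avg * old_n + batch_avg * batch_n) / new_n

# e.g. 10 datapoints averaging 0.5 merged with a batch of 5 averaging 0.8
print(merge_average(0.5, 10, 0.8, 5))  # 0.6
```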
16945
16946    @app.post("/api/get_proxy")
16947    async def get_proxy(
16948        hotkey: Annotated[str, Depends(get_hotkey)]
16949    ) -> str:
16950        
16951        if not authenticate_with_bittensor(hotkey, metagraph) and not authenticate_with_commune(hotkey, commune_keys):
16952            raise HTTPException(
16953                status_code=status.HTTP_403_FORBIDDEN,
16954                detail="Valid hotkey required.",
16955            )
16956        
16957        return random.choice(PROXY_LIST)
16958    
16959    ################ START OMEGA FOCUS ENDPOINTS ################
16960    @app.post("/api/focus/get_focus_score")
16961    async def get_focus_score(
16962        api_key: str = Security(get_focus_api_key),
16963        video_id: Annotated[str, Body()] = None,
16964        focusing_task: Annotated[str, Body()] = None,
16965        focusing_description: Annotated[str, Body()] = None,
16966        background_tasks: BackgroundTasks = BackgroundTasks(),
16967    ) -> Dict[str, bool]:
16968        background_tasks.add_task(run_focus_scoring, video_id, focusing_task, focusing_description)
16969        return { "success": True }
16970
16971    @app.get("/api/focus/get_list")
16972    @limiter.limit("1000/minute")
16973    async def _get_available_focus_video_list(
16974        request: Request,
16975        db: Session=Depends(get_db)
16976    ):
16977        """
16978        Return all available focus videos
16979        """
16980        return await get_all_available_focus(db)
16981
16982    # FV TODO: let's do proper miner auth here instead, and then from the retrieved hotkey, we can also
16983    # retrieve the coldkey and use that to confirm the transfer
16984    @app.post("/api/focus/purchase")
16985    @limiter.limit("20/minute")
16986    async def purchase_video(
16987        request: Request,
16988        background_tasks: BackgroundTasks,
16989        video_id: Annotated[str, Body()],
16990        miner_hotkey: Annotated[str, Body()],
16991    ):
16992        if await already_purchased_max_focus_tao():
16993            print("Purchases in the last 24 hours have reached the max focus tao limit.")
16994            raise HTTPException(400, "Purchases in the last 24 hours have reached the max focus tao limit, please try again later.")
16995
16996        with get_db_context() as db:
16997            availability = await check_availability(db, video_id, miner_hotkey, True) # run with_lock True
16998            print('availability', availability)
16999            if availability['status'] == 'success':
17000                amount = availability['price']
17001                video_owner_coldkey = get_video_owner_coldkey(db, video_id) # run with_lock True
17002                background_tasks.add_task(confirm_video_purchased, video_id, True) # run with_lock True
17003                return {
17004                    'status': 'success',
17005                    'address': video_owner_coldkey,
17006                    'amount': amount,
17007                }
17008            else:
17009                return availability
17010        
17011    @app.post("/api/focus/revert-pending-purchase")
17012    @limiter.limit("100/minute")
17013    async def revert_pending_purchase(
17014        request: Request,
17015        video: VideoPurchaseRevert,
17016        db: Session=Depends(get_db),
17017    ):
17018        return mark_video_submitted(db, video.video_id, True) # run with_lock True
17019        
17020    @app.post("/api/focus/verify-purchase")
17021    @limiter.limit("100/minute")
17022    async def verify_purchase(
17023        request: Request,
17024        miner_hotkey: Annotated[str, Body()],
17025        video_id: Annotated[str, Body()],
17026        block_hash: Annotated[str, Body()],
17027        db: Session=Depends(get_db),
17028    ):
17029        video_owner_coldkey = get_video_owner_coldkey(db, video_id) # run with_lock True
17030        result = await confirm_transfer(db, video_owner_coldkey, video_id, miner_hotkey, block_hash)
17031        if result:
17032            return {
17033                'status': 'success',
17034                'message': 'Video purchase verification was successful'
17035            }
17036        else:
17037            return {
17038                'status': 'error',
17039                'message': f'Video purchase verification failed for video_id {video_id} on block_hash {block_hash} by miner_hotkey {miner_hotkey}'
17040            }
17041
17042    @app.get('/api/focus/miner_purchase_scores/{miner_hotkey_list}')
17043    async def miner_purchase_scores(
17044        miner_hotkey_list: str,
17045        db: Session = Depends(get_db)
17046    ) -> Dict[str, MinerPurchaseStats]:
17047        return {
17048            hotkey: await get_miner_purchase_stats(db, hotkey)
17049            for hotkey in miner_hotkey_list.split(',')
17050        }
17051
17052    class TaskTypeMap(BaseModel):
17053        task_type_map: Dict[TaskType, float]
17054
17055    @app.get('/api/focus/get_task_percentage_map')
17056    def get_task_percentage_map():
17057        return TaskTypeMap(task_type_map=TASK_TYPE_MAP)
17058
17059    @app.get('/api/focus/get_rewards_percent')
17060    async def get_rewards_percent():
17061        return FOCUS_REWARDS_PERCENT
17062    
17063    @app.get('/api/focus/get_max_focus_tao')
17064    async def _get_max_focus_tao() -> float:
17065        return await get_max_focus_tao()
17066    
17067    @app.get('/api/focus/get_purchase_max_focus_tao')
17068    async def _get_purchase_max_focus_tao() -> float:
17069        return await get_purchase_max_focus_tao()
17070    
17071    async def cache_max_focus_tao():
17072        """Periodically re-caches the value of max_focus_tao."""
17073        while True:
17074            print("cache_max_focus_tao()")
17075
17076            max_attempts = 3
17077            attempt = 0
17078
17079            while attempt < max_attempts:
17080                try:
17081                    max_focus_tao = await get_max_focus_tao()
17082                    break  # Exit the loop if the function succeeds
17083
17084                # In case of unforeseen errors, the api will log the error and continue operations.
17085                except Exception as err:
17086                    attempt += 1
17087                    print(f"Error during recaching of max_focus_tao (Attempt {attempt}/{max_attempts}):", str(err))
17088
17089                    if attempt >= max_attempts:
17090                        print("Max attempts reached. Skipping caching this cycle.")
17091                        break
17092
17093            # Sleep in seconds
17094            await asyncio.sleep(1800) # 30 minutes
17095    ################ END OMEGA FOCUS ENDPOINTS ################
17096    
17097    """ TO BE DEPRECATED """
17098    @app.post("/api/validate")
17099    async def validate(
17100        videos: Videos,
17101        hotkey: Annotated[str, Depends(get_hotkey)],
17102    ) -> Optional[float]:
17103        if not authenticate_with_bittensor(hotkey, metagraph):
17104            raise HTTPException(
17105                status_code=status.HTTP_403_FORBIDDEN,
17106                detail="Valid hotkey required.",
17107            )
17108        uid = metagraph.hotkeys.index(hotkey)
17109        
17110        start_time = time.time()
17111        
17112        youtube_rewards = await score.score_and_upload_videos(videos, await imagebind_loader.get_imagebind())
17113
17114        if youtube_rewards is None:
17115            print("YouTube rewards are empty, returning None")
17116            return None
17117        
17118        total_rewards: float = youtube_rewards
17119        
17120        print(f"Total Rewards: {total_rewards}")
17121        print(f"Returning score={total_rewards} for validator={uid} in {time.time() - start_time:.2f}s")
17122
17123        return total_rewards
17124
17125    if not IS_PROD:
17126        @app.get("/api/count_unique")
17127        async def count_unique(
17128            videos: Videos,
17129        ) -> str:
17130            nunique = await score.get_num_unique_videos(videos)
17131            return f"{nunique} out of {len(videos.video_metadata)} submitted videos are unique"
17132
17133        @app.get("/api/check_score")
17134        async def check_score(
17135            videos: Videos,
17136        ) -> dict:
17137            detailed_score = await score.score_videos_for_testing(videos, await imagebind_loader.get_imagebind())
17138            return detailed_score
17139
17140    @app.get("/api/topic")
17141    async def get_topic() -> str:
17142        return random.choice(TOPICS_LIST)
17143    
17144    @app.get("/api/topics")
17145    async def get_topics() -> List[str]:
17146        return TOPICS_LIST
17147
17148    @app.get("/")
17149    def healthcheck():
17150        return datetime.utcnow()
17151
17152    ################ START MULTI-MODAL API / OPENTENSOR CONNECTOR ################
17153    @app.get("/api/mm/topics")
17154    async def get_mm_topics(api_key: str = Security(get_api_key)):
17155        try:
17156            connection = connect_to_db()
17157            query = "SELECT DISTINCT query FROM omega_multimodal"
17158            cursor = connection.cursor()
17159            cursor.execute(query)
17160            data = [row[0] for row in cursor.fetchall()]
17161            
17162            cursor.close()
17163            connection.close()
17164            return data
17165        except mysql.connector.Error as err:
17166            raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17167        
17168
17169    @app.get("/api/mm/topic_video_count")
17170    async def get_mm_topic_video_count(api_key: str = Security(get_api_key)):
17171        try:
17172            connection = connect_to_db()
17173            query = "SELECT query, COUNT(*) AS num_videos FROM omega_multimodal GROUP BY query"
17174            cursor = connection.cursor(dictionary=True)
17175            cursor.execute(query)
17176            data = cursor.fetchall()
17177            
17178            cursor.close()
17179            connection.close()
17180            return data        
17181        except mysql.connector.Error as err:
17182            raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17183
17184
17185    @app.get("/api/mm/topic_relevant/{topic}")
17186    async def get_mm_topic_relevant(api_key: str = Security(get_api_key), topic: str = Path(...)):
17187        try:
17188            connection = connect_to_db()
17189            query = "SELECT video_id, youtube_id, description, start_time, end_time FROM omega_multimodal WHERE query = %s ORDER BY query_relevance_score DESC LIMIT 100"
17190            cursor = connection.cursor(dictionary=True)
17191            cursor.execute(query, (topic,))
17192            data = cursor.fetchall()
17193            
17194            cursor.close()
17195            connection.close()
17196            return data        
17197        except mysql.connector.Error as err:
17198            raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17199    ################ END MULTI-MODAL API / OPENTENSOR CONNECTOR ################
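As a general pattern for the endpoints above, caller-supplied values belong in driver-bound parameters, not in the SQL text itself. A minimal, self-contained sketch of the placeholder pattern, shown with the stdlib `sqlite3` driver (which uses `?`; `mysql-connector` uses `%s`) and a toy table mirroring `omega_multimodal`:

```python
import sqlite3

# In-memory table standing in for omega_multimodal.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE omega_multimodal (video_id TEXT, query TEXT)")
conn.execute("INSERT INTO omega_multimodal VALUES ('abc123', 'minecraft')")

# A hostile topic string is bound as a literal value, not spliced into the SQL.
topic = "minecraft' OR '1'='1"
rows = conn.execute(
    "SELECT video_id FROM omega_multimodal WHERE query = ?", (topic,)
).fetchall()
print(rows)  # the injection attempt matches nothing

# The legitimate topic still matches normally.
rows = conn.execute(
    "SELECT video_id FROM omega_multimodal WHERE query = ?", ("minecraft",)
).fetchall()
print(rows)
```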
17200
17201    ################ START LEADERBOARD ################
17202    @app.get("/api/leaderboard")
17203    async def get_leaderboard_data(hotkey: Optional[str] = None, sort_by: Optional[str] = None, sort_order: Optional[str] = None):
17204        try:
17205            leaderboard_table_name = "miner_leaderboard"
17206            if not IS_PROD:
17207                leaderboard_table_name += "_test"
17208            connection = connect_to_db()
17209            query = f"SELECT * FROM {leaderboard_table_name}"
17210            params = []
17211
17212            # Filter by hotkey if provided
17213            if hotkey:
17214                query += " WHERE hotkey = %s"
17215                params.append(hotkey)
17216
17217            # Sort by the specified column if provided, default to 'datapoints'
17218            sort_column = "datapoints"  # Default sort column
17219            order = "DESC"  # Default sort order; kept separate from the sort_order parameter
17220            if sort_by:
17221                # Only allow whitelisted column names into the query text
17222                valid_sort_columns = {
17223                    "datapoints": "datapoints",
17224                    "avg_desc_relevance": "avg_desc_relevance",
17225                    "avg_query_relevance": "avg_query_relevance",
17226                    "avg_novelty": "avg_novelty",
17227                    "avg_score": "avg_score",
17228                    "last_updated": "last_updated"
17229                }
17230                sort_column = valid_sort_columns.get(sort_by, sort_column)
17231            if sort_order:
17232                # Whitelist the direction; never interpolate unvalidated input into ORDER BY
17233                valid_sort_orders = {
17234                    "asc": "ASC",
17235                    "desc": "DESC"
17236                }
17237                order = valid_sort_orders.get(sort_order.lower(), order)
17238            
17239            query += f" ORDER BY {sort_column} {order}"
17240
17241            cursor = connection.cursor(dictionary=True)
17242            cursor.execute(query, params)
17243            data = cursor.fetchall()
17244            
17245            cursor.close()
17246            connection.close()
17247            return data        
17248        except mysql.connector.Error as err:
17249            raise HTTPException(status_code=500, detail=f"Error fetching data from MySQL database: {err}")
17250    
17251    @app.get("/leaderboard")
17252    def leaderboard():
17253        return FileResponse('./validator-api/static/leaderboard.html')
17254    
17255    @app.get("/api/leaderboard-dataset-data")
17256    async def get_leaderboard_dataset_data():
17257        try:
17258            connection = connect_to_db()
17259            query = "SELECT * FROM hf_dataset_snapshots ORDER BY snapshot_date ASC"
17260            cursor = connection.cursor(dictionary=True)
17261            cursor.execute(query)
17262            data = cursor.fetchall()
17263            
17264            cursor.close()
17265            connection.close()
17266            return data        
17267        except mysql.connector.Error as err:
17268            raise HTTPException(status_code=500, detail=f"Error fetching leaderboard dataset data from MySQL database: {err}")
17269        
17270    @app.get("/api/leaderboard-miner-data")
17271    async def get_leaderboard_miner_data(hotkey: Optional[str] = None):
17272        try:
17273            connection = connect_to_db()
17274            params = []
17275
17276            query = "SELECT * FROM miner_leaderboard_snapshots WHERE 1=1"
17277
17278            # Filter by hotkey if provided
17279            if hotkey:
17280                query += " AND hotkey = %s"
17281                params.append(hotkey)
17282
17283            query += " ORDER BY snapshot_date ASC"
17284
17285            cursor = connection.cursor(dictionary=True)
17286            cursor.execute(query, params)
17287            data = cursor.fetchall()
17288            
17289            cursor.close()
17290            connection.close()
17291            return data        
17292        except mysql.connector.Error as err:
17293            raise HTTPException(status_code=500, detail=f"Error fetching leaderboard miner data from MySQL database: {err}")
17294        
17295    @app.get("/api/leaderboard-focus-data")
17296    async def get_leaderboard_focus_data():
17297        try:
17298            connection = connect_to_db()
17299            query = "SELECT * FROM focus_kpi_snapshots ORDER BY snapshot_date ASC"
17300            cursor = connection.cursor(dictionary=True)
17301            cursor.execute(query)
17302            data = cursor.fetchall()
17303            
17304            cursor.close()
17305            connection.close()
17306            return data        
17307        except mysql.connector.Error as err:
17308            raise HTTPException(status_code=500, detail=f"Error fetching focus kpi data from MySQL database: {err}")
17309    ################ END LEADERBOARD ################
17310
17311    ################ START DASHBOARD ################
17312    async def resync_dataset():
17313        """Periodically resyncs the dataset by updating our JSON data source from the huggingface dataset."""
17314        while True:
17315            print("resync_dataset()")
17316
17317            max_attempts = 3
17318            attempt = 0
17319
17320            while attempt < max_attempts:
17321                try:
17322                    pull_and_cache_dataset()
17323                    break  # Exit the loop if the function succeeds
17324
17325                # In case of unforeseen errors, the api will log the error and continue operations.
17326                except Exception as err:
17327                    attempt += 1
17328                    print(f"Error during dataset sync (Attempt {attempt}/{max_attempts}):", str(err))
17329                    #print_exception(type(err), err, err.__traceback__)
17330
17331                    if attempt >= max_attempts:
17332                        print("Max attempts reached. Skipping this sync cycle.")
17333                        break
17334
17335            # Sleep in seconds
17336            await asyncio.sleep(1800) # 30 minutes
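Both `cache_max_focus_tao()` and `resync_dataset()` repeat the same bounded-retry loop. A hedged sketch of a shared helper that could factor it out (the `retry_async` name is hypothetical, not part of this codebase):

```python
import asyncio

async def retry_async(fn, max_attempts: int = 3, label: str = "task"):
    """Await fn() up to max_attempts times, logging failures like the loops above."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await fn()
        except Exception as err:
            print(f"Error during {label} (Attempt {attempt}/{max_attempts}):", str(err))
    print("Max attempts reached. Skipping this cycle.")
    return None

async def demo():
    calls = {"n": 0}

    async def flaky():
        # Fails once, then succeeds, to exercise the retry path.
        calls["n"] += 1
        if calls["n"] < 2:
            raise RuntimeError("transient")
        return "ok"

    return await retry_async(flaky, label="demo")

print(asyncio.run(demo()))  # ok
```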
17337
17338    @app.get("/dashboard/get-video-metadata")
17339    async def get_video_metadata(
17340        sort_by: Optional[str] = "submitted_at",
17341        sort_order: Optional[str] = "desc",
17342        page: Optional[int] = 1,
17343        items_per_page: Optional[int] = 50
17344    ):
17345        print("get_video_metadata()")
17346        if os.path.exists(CACHE_FILE):
17347            with open(CACHE_FILE, "r") as f:
17348                descriptions = json.load(f)
17349            
17350            # Define a mapping from sort_by parameter to the index in the metadata list
17351            sort_index_mapping = {
17352                "video_id": 0,
17353                "youtube_id": 1,
17354                "start_time": 2,
17355                "end_time": 3,
17356                "description": 4,
17357                "description_relevance_score": 5,
17358                "query_relevance_score": 6,
17359                "query": 7,
17360                "submitted_at": 8
17361            }
17362            
17363            if sort_by and sort_by in sort_index_mapping:
17364                index = sort_index_mapping[sort_by]
17365                reverse = sort_order == "desc"
17366                descriptions.sort(key=lambda x: x[index], reverse=reverse)
17367            
17368            # Pagination logic
17369            total_items = len(descriptions)
17370            start = (page - 1) * items_per_page
17371            end = start + items_per_page
17372            paginated_descriptions = descriptions[start:end]
17373
17374            for video in paginated_descriptions:
17375                video[0] = ".." + str(video[0])[:6]
17376                video[5] = round(video[5], 4)  # Round description_relevance_score
17377                video[6] = round(video[6], 4)  # Round query_relevance_score
17378                date_time = datetime.fromtimestamp(video[8])
17379                video[8] = date_time.strftime('%Y-%m-%d %H:%M:%S')  # Format submitted_at
17380            
17381            return {
17382                "total_items": total_items,
17383                "page": page,
17384                "items_per_page": items_per_page,
17385                "data": paginated_descriptions
17386            }
17387        else:
17388            return {"error": "Cache file not found"}
17389    
17390    @app.get("/dashboard")
17391    def dashboard():
17392        print("dashboard()")
17393        return FileResponse('validator-api/static/dashboard.html')
17394    ################ END DASHBOARD ################
17395
17396    async def run_server():
17397        print("run_server()")
17398        config = uvicorn.Config(app=app, host="0.0.0.0", port=8001)
17399        server = uvicorn.Server(config)
17400        await server.serve()
17401    
17402    server_task = asyncio.create_task(run_server())
17403    try:
17404        # Wait for the server to start
17405        tasks_list = [
17406            server_task,
17407            resync_metagraph(),
17408            cache_max_focus_tao(),
17409        ]
17410        if IS_PROD:
17411            tasks_list.append(resync_dataset())
17412        await asyncio.gather(*tasks_list)
17413    except asyncio.CancelledError:
17414        server_task.cancel()
17415        await asyncio.gather(server_task, return_exceptions=True)
17416
17417if __name__ == "__main__":
17418    asyncio.run(main())
17419
17420
17421
17422---
17423File: /validator-api/clear_index.py
17424---
17425
17426from validator_api import config
17427from pinecone import Pinecone
17428
17429PINECONE_INDEX = Pinecone(api_key=config.PINECONE_API_KEY).Index(config.PINECONE_INDEX)
17430PINECONE_INDEX.delete(delete_all=True)
17431
17432
17433
17434---
17435File: /validator-api/test_search_and_submit.py
17436---
17437
17438from omega.miner_utils import search_and_embed_youtube_videos, ImageBind, video_utils
17439from omega.protocol import Videos
17440from validator_api.dataset_upload import dataset_uploader
17441from validator_api.score import score_and_upload_videos
17442import asyncio
17443import time
17444
17445imagebind = ImageBind()
17446start = time.time()
17447query = "minecraft gameplay footage"
17448num_videos = 1
17449video_metadata_list = search_and_embed_youtube_videos(query, num_videos, imagebind)
17450print(f"Search and embed took {time.time() - start} seconds")
17451
17452videos = Videos(
17453    query=query,
17454    num_videos=num_videos,
17455    video_metadata=video_metadata_list,
17456)
17457
17458# dataset_uploader.min_batch_size = 2  # override to force upload
17459# dataset_uploader.desired_batch_size = 2  # override to force upload
17460# print(asyncio.run(score_and_upload_videos(videos, imagebind)))
17461
17462
17463
17464---
17465File: /auto_updating_validator.sh
17466---
17467
17468#!/bin/bash
17469
17470VALIDATOR_ARGS=("$@")
17471
17472# first, git pull
17473git pull
17474
17475# next, set up environment
17476pip install -e .
17477
17478# finally, run the validator
17479python neurons/validator.py "${VALIDATOR_ARGS[@]}" --neuron.auto_update
17480
17481
17482
17483---
17484File: /purchase_focus_video.py
17485---
17486
17487"""
17488Using the OMEGA Focus Video Purchase System:
17489
174901. Setup:
17491   - Ensure you have the latest required libraries installed. See requirements.txt.
17492   - Make sure you have your SN24 Bittensor wallet set up. You *MUST* use your SN24 registered wallet to purchase videos.
17493
174942. Running the Script:
17495   - Open a terminal and navigate to the directory containing the script.
17496   - Run the script with: `python purchase_focus_video.py`
17497
174983. Main Menu Options:
17499   When you run the script, you'll see a menu with 5 options:
17500
17501   1. View Focus Videos
17502   2. Purchase Focus Video
17503   3. Verify Purchase
17504   4. Display Order History
17505   5. Exit
17506
175074. Using the Options:
17508
17509   Option 1: View Focus Videos
17510   - Displays a list of available focus videos with details like Video ID, Score, Cost, and Expected Reward.
17511   - The displayed cost is the amount of TAO tokens required to purchase the video.
17512   - The expected reward is the amount of TAO tokens you'll earn from SN24 emissions for purchasing the video.
17513   - Select a number from the list next to the video you want to purchase.
17514
17515   Option 2: Purchase Focus Video
17516   - Allows you to purchase a video by entering its ID.
17517   - You'll need to provide your wallet information (name, hotkey, path).
17518   - The script will initiate a transfer of TAO tokens to the OMEGA Focus App user who created the video. This secures the purchase of the video.
17519   - After the transfer is complete, the script will attempt to verify the purchase. 
17520   - Once successful, you're all set! SN24 validators will automatically detect your purchase and reward your expected TAO emissions.
17521
17522   Option 3: Verify Purchase
17523   - This option is used when there are issues with the purchase verification during the purchase process. 
17524   - If you've successfully transferred the TAO tokens but the purchase wasn't verified, you can use this option to verify the purchase.
17525   - You'll need to provide the Video ID, Miner Hotkey, and Block Hash.
17526
17527   Option 4: Display Order History
17528   - Shows a list of your previous purchases and their current status.
17529
17530   Option 5: Exit
17531   - Closes the application.
17532
175335. Important Notes:
17534   - The script can be run on Bittensor mainnet or testnet based on the SUBTENSOR_NETWORK variable. Set it to "test" for testnet, or None for mainnet.
17535   - Purchases are saved locally in '~/.omega/focus_videos.json'.
17536   - Always ensure you have sufficient TAO tokens in your wallet before making a purchase.
17537   - Once a purchase has been verified successful, SN24 validators will automatically detect your purchase and reward your expected TAO emissions.
17538
175396. Wallet Information:
17540   - When purchasing, you'll need to provide your Bittensor wallet details.
17541   - You *MUST* use your SN24 registered wallet to purchase videos.
17542   - The default wallet path is '~/.bittensor/wallets/'.
17543
17544Remember to keep your wallet information secure and never share your private keys.
17545"""
17546
17547import os
17548import requests
17549import bittensor as bt
17550from bittensor import wallet as btcli_wallet
17551import argparse
17552import time
17553import json
17554from tabulate import tabulate
17555from datetime import datetime
17556import multiprocessing
17557import sys
17558
17559parser = argparse.ArgumentParser(description='Interact with the OMEGA Focus Videos API.')
17560args = parser.parse_args()
17561
17562SUBTENSOR_NETWORK = None # "test" or None
17563
17564API_BASE = (
17565    "https://dev-validator.api.omega-labs.ai"
17566    if SUBTENSOR_NETWORK == "test" else
17567    "https://validator.api.omega-labs.ai"
17568)
17569
17570CYAN = "\033[96m"
17571GREEN = "\033[92m"
17572RED = "\033[91m"
17573RESET = "\033[0m"
17574
17575def initialize_subtensor():
17576    try:
17577        subtensor = bt.subtensor(network=SUBTENSOR_NETWORK)
17578        #print(f"{GREEN}Subtensor initialized successfully.{RESET}")
17579        return subtensor
17580    except Exception as e:
17581        print(f"{RED}Error initializing subtensor: {str(e)}{RESET}")
17582        raise
17583
17584def list_videos():
17585    videos_response = requests.get(
17586        API_BASE + "/api/focus/get_list",
17587        headers={"Content-Type": "application/json"},
17588        timeout=30
17589    )
17590
17591    if videos_response.status_code != 200:
17592        print(f"{RED}Error fetching focus videos: {videos_response.status_code}{RESET}")
17593        return None
17594    
17595    videos_data = videos_response.json()
17596    return videos_data
17597
17598def display_videos(videos_data):
17599    if not videos_data or len(videos_data) == 0:
17600        print(f"\n{RED}No videos available.{RESET}")
17601        return
17602
17603    print(f"\n{CYAN}Available Focus Videos:{RESET}")
17604    
17605    # Prepare the data for tabulate
17606    table_data = []
17607    for idx, video in enumerate(videos_data, 1):
17608        # Convert created_at to a more readable format
17609        created_at = datetime.fromisoformat(video['created_at'].replace('Z', '+00:00'))
17610        formatted_date = created_at.strftime("%Y-%m-%d %H:%M:%S")
17611        
17612        table_data.append([
17613            idx,
17614            video['video_id'],
17615            f"{video['video_score']:.3f}",
17616            f"{video['expected_reward_tao']:.5f}",
17617            f"{float(video['expected_reward_tao']) / 0.9:.5f}",
17618            #formatted_date
17619        ])
17620    
17621    # Create the table
17622    headers = ["#", "Video ID", "Score", "Cost (TAO)", "Expected Reward (TAO)"]
17623    table = tabulate(table_data, headers=headers, tablefmt="pretty")
17624    
17625    print(table)
17626
17627
class TransferTimeout(Exception):
    pass

def reset_terminal():
    # Try multiple methods to reset the terminal
    os.system('stty sane')
    os.system('reset')
    sys.stdout.write('\033[0m')
    sys.stdout.flush()

def transfer_operation(wallet, transfer_address_to, transfer_balance, result_queue):
    try:
        subtensor = initialize_subtensor()
        success, block_hash, err_msg = subtensor._do_transfer(
            wallet,
            transfer_address_to,
            transfer_balance,
            wait_for_finalization=True,
            wait_for_inclusion=True,
        )
        result_queue.put((success, block_hash, err_msg))
    except Exception as e:
        result_queue.put((False, None, str(e)))

def transfer_with_timeout(wallet, transfer_address_to, transfer_balance):
    result_queue = multiprocessing.Queue()

    transfer_process = multiprocessing.Process(
        target=transfer_operation,
        args=(wallet, transfer_address_to, transfer_balance, result_queue)
    )

    transfer_process.start()
    transfer_process.join(timeout=150)  # 2m 30s = 150 seconds

    if transfer_process.is_alive():
        transfer_process.terminate()
        transfer_process.join()
        reset_terminal()
        # Signal the timeout to the caller so the pending purchase can be reverted
        raise TransferTimeout("Transfer operation timed out after 2 minutes 30 seconds")

    if not result_queue.empty():
        return result_queue.get()
    else:
        return False, None, "Transfer process exited without result"

def purchase_video(video_id=None, wallet_name=None, wallet_hotkey=None, wallet_path=None):
    if not video_id:
        video_id = input(f"{CYAN}Enter focus video id: {RESET}")

    if wallet_name is not None:
        name = wallet_name
    else:
        name = input(f"{CYAN}Enter wallet name (default: Coldkey): {RESET}") or "Coldkey"
    if wallet_hotkey is not None:
        hotkey_name = wallet_hotkey
    else:
        hotkey_name = input(f"{CYAN}Enter wallet hotkey name (default: Hotkey): {RESET}") or "Hotkey"
    if wallet_path is not None:
        path = wallet_path
    else:
        path = input(f"{CYAN}Enter wallet path (default: ~/.bittensor/wallets/): {RESET}") or "~/.bittensor/wallets/"

    wallet = btcli_wallet(name=name, hotkey=hotkey_name, path=path)
    try:
        hotkey = wallet.get_hotkey()
    except Exception as e:
        print(f"{RED}Error loading hotkey: {e} {RESET}")
        return

    miner_hotkey = hotkey.ss58_address

    print(f"Purchasing video {video_id}...")
    print(f"{RED}You will only have 2 minutes and 30 seconds to complete the transfer of TAO tokens, otherwise the purchase will be reverted.{RESET}")
    purchase_response = requests.post(
        API_BASE + "/api/focus/purchase",
        json={"video_id": video_id, "miner_hotkey": miner_hotkey},
        headers={"Content-Type": "application/json"},
        timeout=60
    )

    purchase_data = purchase_response.json()
    if purchase_response.status_code != 200:
        print(f"{RED}Error purchasing video {video_id}: {purchase_response.status_code}{RESET}")
        if "detail" in purchase_data:
            print(f"{RED}Details: {purchase_data['detail']}{RESET}")
        return

    if "status" in purchase_data and purchase_data["status"] == "error":
        print(f"{RED}Error purchasing video {video_id}: {purchase_data['message']}{RESET}")
        return

    try:
        transfer_address_to = purchase_data["address"]
        transfer_amount = purchase_data["amount"]

        print(f"Initiating transfer of {transfer_amount} TAO for video {video_id}...")

        transfer_balance = bt.Balance.from_tao(transfer_amount)

        try:
            success, block_hash, err_msg = transfer_with_timeout(wallet, transfer_address_to, transfer_balance)
        except TransferTimeout:
            print(f"\n{RED}Transfer operation timed out after 2 minutes and 30 seconds. Aborting purchase.{RESET}")
            reset_terminal()
            revert_pending_purchase(video_id)
            repurchase_input(video_id, name, hotkey_name, path)
            return

        if success:
            print(f"{GREEN}Transfer finalized. Block Hash: {block_hash}{RESET}")
            save_purchase_info(video_id, miner_hotkey, block_hash, "purchased", transfer_amount)
            verify_result = verify_purchase(video_id, miner_hotkey, block_hash)
            if not verify_result:
                print(f"{RED}There was an error verifying your purchase after successfully transferring TAO. Please try the 'Verify Purchase' option immediately and contact an admin if you are unable to successfully verify.{RESET}")
        else:
            print(f"{RED}Failed to complete transfer for video {video_id}.{RESET}")
            revert_pending_purchase(video_id)
            repurchase_input(video_id, name, hotkey_name, path)

    except Exception as e:
        print(f"{RED}Error transferring TAO tokens: {str(e)}{RESET}")
        if "EOF occurred in violation of protocol" in str(e):
            print(f"{RED}Subtensor connection error detected. Re-initializing subtensor.{RESET}")
            initialize_subtensor()
        revert_pending_purchase(video_id)
        repurchase_input(video_id, name, hotkey_name, path)

def revert_pending_purchase(video_id):
    print(f"Reverting pending purchase of video {video_id}...")
    revert_response = requests.post(
        API_BASE + "/api/focus/revert-pending-purchase",
        json={"video_id": video_id},
        headers={"Content-Type": "application/json"},
        timeout=60
    )
    if revert_response.status_code != 200:
        print(f"{RED}Error reverting pending purchase of video {video_id}: {revert_response.status_code}{RESET}")
        return
    print(f"{GREEN}Pending purchase of video {video_id} reverted successfully.{RESET}")

def repurchase_input(video_id, wallet_name=None, wallet_hotkey=None, wallet_path=None):
    repurchase = input(f"{CYAN}Do you want to repurchase video {video_id}? (y/n): {RESET}").lower()
    if repurchase == 'y':
        purchase_video(video_id, wallet_name, wallet_hotkey, wallet_path)
    elif repurchase != 'n':
        print(f"{RED}Invalid input. Please enter 'y' or 'n'.{RESET}")
        repurchase_input(video_id, wallet_name, wallet_hotkey, wallet_path)

def display_saved_orders(for_verification=False):
    purchases_file = os.path.expanduser("~/.omega/focus_videos.json")
    if not os.path.exists(purchases_file):
        print(f"{RED}No saved orders found.{RESET}")
        return None

    with open(purchases_file, 'r') as f:
        purchases = json.load(f)

    if not purchases:
        print(f"{RED}No saved orders found.{RESET}")
        return None

    purchases.sort(key=lambda x: x.get('created_at', ''), reverse=True)

    print(f"\n{CYAN}Saved Orders:{RESET}")

    table_data = []
    for idx, purchase in enumerate(purchases, 1):
        created_at = purchase.get('created_at', 'N/A')
        if created_at != 'N/A':
            created_at = datetime.fromisoformat(created_at.replace('Z', '+00:00')).strftime("%Y-%m-%d %H:%M:%S")

        table_data.append([
            idx,
            purchase['video_id'],
            purchase['state'],
            purchase.get('amount', 'N/A'),
            f"{float(purchase.get('amount', 0)) / 0.9:.5f}",
            purchase.get('miner_hotkey', 'N/A')[:5] + '...' + purchase.get('miner_hotkey', 'N/A')[-5:],
            purchase['block_hash'][:5] + '...' + purchase['block_hash'][-5:],
            created_at
        ])

    headers = ["#", "Video ID", "Purchase State", "Cost (TAO)", "Estimated Reward (TAO)", "Purchasing Hotkey", "Block Hash", "Purchase Date"]
    table = tabulate(table_data, headers=headers, tablefmt="pretty")

    print(table)
    return purchases

def select_order_for_verification():
    purchases = display_saved_orders()

    while True:
        if purchases:
            print("*** NOTE: A purchase is finalized when the purchase state is 'verified'. ***")
            choice = input(f"{CYAN}Enter the number of the order to verify, 'm' for manual input, or 'n' to cancel: {RESET}").lower()
        else:
            choice = 'm'

        if choice == 'n':
            return None, None, None
        elif choice == 'm':
            video_id = input(f"{CYAN}Enter video ID: {RESET}")
            miner_hotkey = input(f"{CYAN}Enter miner hotkey: {RESET}")
            block_hash = input(f"{CYAN}Enter block hash: {RESET}")
            return video_id, miner_hotkey, block_hash
        elif choice.isdigit():
            idx = int(choice) - 1
            if 0 <= idx < len(purchases):
                selected = purchases[idx]
                return selected['video_id'], selected.get('miner_hotkey', ''), selected['block_hash']
            else:
                print(f"{RED}Invalid selection. Please try again.{RESET}")
        else:
            print(f"{RED}Invalid input. Please try again.{RESET}")

def select_order_for_full_display(purchases):
    while True:
        choice = input(f"{CYAN}Enter the number of the order to see full details, or 'n' to return to menu: {RESET}").lower()

        if choice == 'n':
            return
        elif choice.isdigit():
            idx = int(choice) - 1
            if 0 <= idx < len(purchases):
                selected = purchases[idx]
                # Display full details
                print(f"\n{CYAN}Order Details:{RESET}")
                print(f"Video ID: {selected['video_id']}")
                print(f"Purchase State: {selected['state']}")
                print(f"Cost (TAO): {selected.get('amount', 'N/A')}")
                print(f"Estimated Reward (TAO): {float(selected.get('amount', 0)) / 0.9:.5f}")
                print(f"Purchasing Hotkey: {selected.get('miner_hotkey', 'N/A')}")
                print(f"Block Hash: {selected['block_hash']}")
                print(f"Purchase Date: {selected.get('created_at', 'N/A')}")
                return
            else:
                print(f"{RED}Invalid selection. Please try again.{RESET}")
        else:
            print(f"{RED}Invalid input. Please try again.{RESET}")

def verify_purchase(video_id=None, miner_hotkey=None, block_hash=None):
    if not all([video_id, miner_hotkey, block_hash]):
        video_id, miner_hotkey, block_hash = select_order_for_verification()
        if not all([video_id, miner_hotkey, block_hash]):
            print(f"{CYAN}Verification cancelled.{RESET}")
            return

    print(f"Verifying purchase for video {video_id} on block hash {block_hash} ...")

    retries = 3
    for attempt in range(retries):
        try:
            verify_response = requests.post(
                API_BASE + "/api/focus/verify-purchase",
                json={"miner_hotkey": miner_hotkey, "video_id": video_id, "block_hash": block_hash},
                headers={"Content-Type": "application/json"},
                timeout=90
            )
            print(f"Purchase verification response for video {video_id}:", verify_response.text)
            if verify_response.status_code == 200:
                print(f"{GREEN}Purchase verified successfully!{RESET}")
                save_purchase_info(video_id, miner_hotkey, block_hash, "verified")
                return True

            if attempt < retries - 1:
                print(f"{CYAN}Attempt #{attempt + 1} to verify purchase failed. Retrying in 2 seconds...{RESET}")
                time.sleep(2)
        except Exception as e:
            if attempt < retries - 1:
                print(f"{CYAN}Attempt #{attempt + 1} to verify purchase failed. Retrying in 2 seconds...{RESET}")
                print(f"{RED}Error: {str(e)}{RESET}")
                time.sleep(2)
            else:
                print(f"{RED}All {retries} attempts failed. Unable to verify purchase.{RESET}")
                return False
    # All attempts returned a non-200 status without raising an exception
    print(f"{RED}All {retries} attempts failed. Unable to verify purchase.{RESET}")
    return False

def save_purchase_info(video_id, hotkey, block_hash, state, amount=None):
    purchases_file = os.path.expanduser("~/.omega/focus_videos.json")
    os.makedirs(os.path.dirname(purchases_file), exist_ok=True)

    purchases = []
    if os.path.exists(purchases_file):
        with open(purchases_file, 'r') as f:
            purchases = json.load(f)

    # Check if the video_id already exists
    for purchase in purchases:
        if purchase['video_id'] == video_id:
            purchase['state'] = state
            purchase['miner_hotkey'] = hotkey
            purchase['block_hash'] = block_hash
            if amount is not None:
                purchase['amount'] = amount
            break
    else:
        # If the video_id doesn't exist, create a new entry
        new_purchase = {
            "video_id": video_id,
            "miner_hotkey": hotkey,
            "block_hash": block_hash,
            "state": state,
            "created_at": datetime.now().isoformat()  # Add creation timestamp
        }
        if amount is not None:
            new_purchase['amount'] = amount
        purchases.append(new_purchase)

    with open(purchases_file, 'w') as f:
        json.dump(purchases, f, indent=2)

    print(f"{GREEN}Purchase information {'updated' if state == 'verified' else 'saved'} to {purchases_file}{RESET}")

def main():
    while True:
        print(f"\n{CYAN}Welcome to the OMEGA Focus Videos Purchase System{RESET}")
        print("1. View + Purchase Focus Videos")
        print("2. Manually Purchase Focus Video")
        print("3. Verify Purchase")
        print("4. Display Order History")
        print("5. Exit")

        choice = input(f"{CYAN}Enter your choice (1-5): {RESET}")

        if choice == '1':
            videos_data = list_videos()
            if videos_data:
                display_videos(videos_data)
                purchase_option = input(f"\n{CYAN}Enter the number of the video you want to purchase or press 'n' to return to menu: {RESET}").lower()
                if purchase_option.isdigit():
                    video_index = int(purchase_option) - 1
                    if 0 <= video_index < len(videos_data):
                        purchase_video(videos_data[video_index]['video_id'])
                    else:
                        print(f"{RED}Invalid video number.{RESET}")
                elif purchase_option != 'n':
                    print(f"{RED}Invalid input. Returning to main menu.{RESET}")
            else:
                print(f"\n{RED}No videos available for purchase at this time.{RESET}")
        elif choice == '2':
            purchase_video()
        elif choice == '3':
            verify_purchase()
        elif choice == '4':
            purchases = display_saved_orders()
            if purchases:
                select_order_for_full_display(purchases)
        elif choice == '5':
            print(f"{GREEN}Thank you for using the OMEGA Focus Videos Purchase System. Goodbye!{RESET}")
            break
        else:
            print(f"{RED}Invalid choice. Please try again.{RESET}")

if __name__ == "__main__":
    try:
        multiprocessing.freeze_support()
        main()
    except KeyboardInterrupt:
        print("\nScript interrupted by user. Exiting.")
        reset_terminal()
        sys.exit(0)
    except Exception as e:
        print(f"\nAn unexpected error occurred: {str(e)}")
        reset_terminal()
        sys.exit(1)


---
File: /README.md
---

<div align="center">

# OMEGA Labs Bittensor Subnet: The World's Largest Decentralized AGI Multimodal Dataset <!-- omit in toc -->
[](https://omegatron.ai)
[](https://opensource.org/licenses/MIT)

---

## Be, and it becomes ... <!-- omit in toc -->

</div>

---
- [Introduction](#introduction)
- [Key Features](#key-features)
- [Miner and Validator Functionality](#miner-and-validator-functionality)
  - [Miner](#miner)
  - [Validator](#validator)
- [Roadmap](#roadmap)
- [Running Miners and Validators](#running-miners-and-validators)
  - [Running a Miner](#running-a-miner)
  - [Running a Validator](#running-a-validator)
- [Contributing](#contributing)
- [License](#license)

---
## Introduction

Welcome to the OMEGA Labs Bittensor subnet, a groundbreaking initiative that aims to create the world's largest decentralized multimodal dataset for accelerating Artificial General Intelligence (AGI) research and development. Our mission is to democratize access to a vast and diverse dataset that captures the landscape of human knowledge and creation, empowering researchers and developers to push the boundaries of AGI.

By harnessing the power of the Bittensor network and a global community of miners and validators, we are building a dataset that surpasses the scale and diversity of existing resources. With over 1 million hours of footage and 30 million+ 2-minute video clips, the OMEGA Labs dataset will enable the development of powerful AGI models and transform various industries.


## Key Features

- 🌍 **Unparalleled Scale and Diversity**: 1 million+ hours of footage, 30 million+ video clips, covering 50+ scenarios and 15,000+ action phrases.
- 🧠 **Latent Representations**: Leveraging state-of-the-art models to translate video components into a unified latent space for efficient processing.
- 💰 **Incentivized Data Collection**: Rewarding miners for contributing high-quality, diverse, and novel videos through a decentralized network.
- 🤖 **Empowering Digital Agents**: Enabling the development of intelligent agents that can navigate complex workflows and assist users across platforms.
- 🎮 **Immersive Gaming Experiences**: Facilitating the creation of realistic gaming environments with rich physics and interactions.

## Miner and Validator Functionality

### Miner

- Performs a simple search on YouTube and retrieves 8 videos at a time.
- Provides a certain clip range (maximum of 2 minutes) and a caption, which includes the title, tags, and description of the video.
- Obtains the ImageBind embeddings for the video, audio, and caption.
- Returns the video ID, caption, ImageBind embeddings (video, audio, caption embeddings), and start and end times for the clips (maximum of 2 minutes).
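
The shape of a miner's response described above can be sketched as a simple data structure. This is an illustrative sketch only: `VideoSubmission`, its field names, and the 120-second check are assumptions for exposition, not the subnet's actual synapse class.

```python
# Hypothetical sketch of what a miner returns for one clip; the real
# request/response classes in the subnet codebase may differ.
from dataclasses import dataclass
from typing import List

@dataclass
class VideoSubmission:
    video_id: str             # YouTube video ID
    caption: str              # title + tags + description
    start_time: float         # clip start, in seconds
    end_time: float           # clip end; end_time - start_time <= 120
    video_emb: List[float]    # ImageBind video embedding
    audio_emb: List[float]    # ImageBind audio embedding
    caption_emb: List[float]  # ImageBind caption embedding

    def clip_length_ok(self) -> bool:
        # Enforce the 2-minute maximum clip length described above
        return 0 < self.end_time - self.start_time <= 120
```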

### Validator

- Takes the received videos from the miners and randomly selects one video for validation.
- Computes the ImageBind embeddings for all three modalities (video, audio, caption) of the selected video.
- Compares the quality of the embeddings to ensure they are consistent with the miner's submissions.
- If the selected video passes the validation, assumes all eight videos from the miner are valid.
- Scores the videos based on relevance, novelty, and detail richness:
  - Relevance: Calculated using cosine similarity between the topic embedding and each of the eight videos.
  - Novelty: For each video, finds the closest video in the Pinecone index and computes 1 - similarity.
    - Potential issue: Choosing the second most similar video instead of the most similar one.
  - Detail Richness: Determined by the cosine similarity between the text and video embeddings.
- Collects 1024 validated video entries and pushes them to Hugging Face as a file, which is then concatenated.
  - If a miner submits too frequently, the validator may increase the file threshold accumulation limit.
  - If the API needs to shut down for any reason, it will submit the remaining validated entries.
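
The relevance and novelty formulas above can be sketched in a few lines of Python. This is a hedged illustration: `relevance_score`, `novelty_score`, and the plain-list embeddings are stand-ins for the real ImageBind embeddings and Pinecone lookup.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two dense vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def relevance_score(topic_emb, video_emb):
    # Relevance: cosine similarity between the topic and the video embedding
    return cosine_similarity(topic_emb, video_emb)

def novelty_score(video_emb, index_embs):
    # Novelty: 1 - similarity to the closest video already in the index
    closest = max(cosine_similarity(video_emb, e) for e in index_embs)
    return 1 - closest
```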

## SN24: Ω Focus Videos Submission

We're excited to introduce a new feature in the SN24 ecosystem: the Focus Video submission and reward process. This system creates a robust marketplace for task-completion videos, leveraging the strengths of the Bittensor network. Here's how it works:

### The Players
1. Ω Focus users: Individuals who complete tasks and record their work
2. SN24 miners: Network participants who can purchase Focus videos
3. SN24 validators: Entities that validate and score submissions
4. Ω Brain: Ω Focus's backend API that processes submissions

### The Process

#### 1. Task Completion and Recording
Ω Focus users create tasks for themselves within the app. They then complete these tasks while screen recording their work via the app.

#### 2. Submission and Initial Processing
Once a task is completed, the user's screen recording and task metadata are uploaded to Ω Brain. This backend system processes the recording, extracting metadata and combining partial clips if necessary.

#### 3. Scoring
Ω Brain forwards the processed video to the SN24 validator API. The validator scores the submission based on predefined criteria. To learn more about the scoring algorithm, check out [this section](#scoring-algorithm) below.

#### 4. User Notification and Marketplace Listing
The Ω Focus user receives their score and an estimate of the potential TAO reward. They can then choose to submit their video to the SN24 Focus Videos marketplace.

#### 5. Miner Purchase
SN24 miners can browse and purchase videos from the marketplace. To make a purchase, a miner notifies the SN24 validator API of their intent. The API informs the miner of the TAO amount to transfer to the Ω Focus user's wallet. [Code here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/focus_app_v1_integration/purchase_focus_video.py)

#### 6. Transaction Verification
Once the miner transfers the TAO, they provide the transaction's block hash to the SN24 validator API. The API then verifies this transaction on the Bittensor chain's public ledger. [Code here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/cron/confirm_purchase.py#L55)

#### 7. Miner Scoring and Reimbursement
SN24 validators, while sending their YouTube scraping requests to miners, also check with the validator API to see if miners have purchased Focus Videos, and adjust miners' scores accordingly. Because validators boost the scores of miners who purchase videos from the marketplace, the Bittensor chain effectively reimburses miners for their Focus Video purchases over the following 24-hour period. [Code here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/omega/base/validator.py#L322-L326)

#### 8. Impact on Miner Scores
Focus Video scores currently make up 2.5% of a miner's total SN24 score. We plan to increase this percentage as the system proves successful.
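
As a rough illustration of that 2.5% weighting, a miner's total score could be blended like this. The constant and the linear-blend formula are assumptions for exposition; the actual combination lives in the validator code.

```python
# Illustrative arithmetic only: how a 2.5% Focus Video weight could combine
# with a miner's base (YouTube-scraping) score.
FOCUS_VIDEOS_PERCENT = 0.025  # 2.5% of the total SN24 score

def combined_score(base_score, focus_score):
    # Linear blend: 97.5% base score, 2.5% Focus Video score
    return (1 - FOCUS_VIDEOS_PERCENT) * base_score + FOCUS_VIDEOS_PERCENT * focus_score
```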

#### 9. Video Availability for External Buyers
Once a Focus Video submission is marked as COMPLETED (which happens when a miner transfers TAO to the Ω Focus user), the video becomes available for purchase by external data buyers, such as AI research labs. (Note: This feature will be implemented in the future.)

### Benefits
- Users are incentivized to complete and record valuable tasks
- Miners can improve their scores by purchasing high-quality Focus Videos
- The network gains a new source of verified, high-quality data
- External entities will gain access to a marketplace of task-completion videos

We believe this system will create a vibrant ecosystem within SN24, driving value for all participants while generating useful data for the broader AI community. We're starting with a conservative 2.5% score impact for Focus Videos, but we're excited to see how this new feature develops and grows within our network.

```mermaid
flowchart TD
    A["👤 Ω Focus User"] -->|"1️⃣ Complete task & record"| B
    B["🧠 Ω Brain"] -->|"2️⃣ Process video"| C
    C{"🛡️ SN24 Validator API"}
    C -->|"3️⃣ Score submission"| A
    A -->|"4️⃣ List video"| E["🎥 Focus Videos Marketplace"]
    F["⛏️ SN24 Miner"] -->|"5️⃣ Purchase video"| E
    F -->|"6️⃣ Transfer TAO"| G["💰 User Wallet"]
    F -.->|"7️⃣ Provide tx hash"| C
    C -.->|"8️⃣ Verify transaction"| I
    I["🔍 SN24 Validator"] -.->|"9️⃣ Check purchases & set weights"| H{"⛓️ Bittensor Chain"}
    H -.->|"🔟 Reimburse miners"| F

    classDef user fill:#30336b,stroke:#333,stroke-width:2px,color:white;
    classDef brain fill:#eeac99,stroke:#333,stroke-width:2px,color:white;
    classDef api fill:#e06377,stroke:#333,stroke-width:2px,color:white;
    classDef market fill:#c83349,stroke:#333,stroke-width:2px,color:white;
    classDef miner fill:#5b9aa0,stroke:#333,stroke-width:2px,color:white;
    classDef validator fill:#f0932b,stroke:#333,stroke-width:2px,color:white;
    classDef chain fill:#6ab04c,stroke:#333,stroke-width:2px,color:white;

    class A,G user;
    class B brain;
    class C api;
    class E market;
    class F miner;
    class H chain;
    class I validator;
```

### Scoring Algorithm

A task completion video's final score is the geometric average of five components:

#### Gemini-based scores
1. task_gemini_score: Gemini's evaluation of the task's quality, based on the task overview, how it feeds into the community's goals, and its relevance to teaching AI systems ([prompt](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/services/focus_scoring_prompts.py#L2))
2. completion_gemini_score: Gemini's evaluation of how well the task was completed and how relevant the video content is to the task and the community's goals ([prompt](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/services/focus_scoring_prompts.py#L88))

#### Embedding-based scores
3. task_uniqueness_score: Uniqueness of the task, based on the embedding similarity of the task overview to existing tasks in the system
4. description_uniqueness_score: Uniqueness of the video description, based on the embedding similarity of the detailed video description to existing video annotations in the system
5. video_uniqueness_score: Uniqueness of the video content, based on the embedding similarity of the video to existing videos in the system

Each component contributes equally to the final score. We chose a geometric average so that no single component can dominate: a low score on any one component pulls the whole score down, so a video must do reasonably well across the board.

You can dig into the code implementation [here](https://github.com/omegalabsinc/omegalabs-bittensor-subnet/blob/8ecf61b5846e2eb226aaa30f01e23df850f3c435/validator-api/validator_api/services/scoring_service.py#L240).
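
The five-way geometric average can be sketched directly. The component names mirror the list above, but this is a minimal sketch: the real scoring service may clamp or normalize scores before combining them.

```python
import math

def final_focus_score(task_gemini_score,
                      completion_gemini_score,
                      task_uniqueness_score,
                      description_uniqueness_score,
                      video_uniqueness_score):
    scores = [
        task_gemini_score,
        completion_gemini_score,
        task_uniqueness_score,
        description_uniqueness_score,
        video_uniqueness_score,
    ]
    # Geometric mean: the 5th root of the product of the five components
    return math.prod(scores) ** (1 / len(scores))
```

Note how a zero on any single component zeroes out the final score, which is the property the equal-weight geometric average provides.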

### Why so Complicated?

Anyone experienced with Bittensor is probably asking themselves right now: why is this video submission process so convoluted? Why not just have Ω Focus users be miners and be compensated directly via the Bittensor chain's emissions each epoch? There are a few reasons:

1. Bittensor's emissions system rewards miners constantly (every epoch), and miners who do not perform well are eventually deregistered and must buy in again; the system is optimized for consistently high performance and throughput. We expect Ω Focus users to complete tasks and submit their screen recordings on irregular schedules (some days you do awesome work, some days you rest). With less consistent schedules, we don't want temporarily inactive users to be deregistered (and subsequently have to re-register to start earning again).
2. Therefore, Ω Labs and SN24 miners act as intermediaries: Ω Focus users complete tasks and submit their recordings on an arbitrary schedule, while SN24 miners consistently buy up available screen recordings and submit them to SN24 validators for verification.
3. Once smart contracts are available on Bittensor, as Const mentioned recently, we will definitely move over to emitting rewards directly to Focus users in a fully decentralized manner.

### Hmmm, this doesn't feel like it's fully decentralized

Yes, we acknowledge that. Even while smart contracts are not available on Bittensor, there is still room for us to decentralize the scoring and purchase verification process further. Some next steps here include:

1. Use a decentralized database to store the Focus Video scores, annotations, and purchase status.
2. Move the scoring to run locally on validators' machines, via open-source video understanding models like Qwen2-VL-72b once it becomes available, or simply by having validators make requests to the Gemini API themselves in the meantime.
3. Create a public dashboard where anyone in the Bittensor community can view the Focus Videos and their associated scores, to judge the quality of the submissions for themselves.

All in all, this is an MVP release and we wanted to ship something to get the ball rolling. We are 100% committed to urgently decentralizing the system as much as possible, but we also want to emphasize the novel nature of what we're implementing here and appreciate everyone's patience as we make the system more robust and decentralized.

Learn more about the Ω Focus app in [this FAQ](https://focus.omega.inc).
18194

## Roadmap

### Phase 1: Foundation (Q1 2024)
- [x] Launch OMEGA Labs subnet on Bittensor testnet
- [x] Reach 100,000 hours of footage and 3 million video clips

### Phase 2: Expansion (Q2 2024)
- [x] Reach 250,000 hours of footage and 15 million video clips
- [x] Train and demo any-to-any models on the dataset
- [ ] Build synthetic data pipelines to enhance dataset quality
- [ ] Publish a research paper on the Bittensor-powered Ω AGI dataset
- [ ] Expand into running inference for state-of-the-art any-to-any multimodal models

### Phase 3: Refinement (Q3 2024)
- [ ] Reach 500,000+ hours of footage and 30 million+ video clips
- [ ] Use the dataset to train powerful unified representation models
- [ ] Fine-tune any-to-any models for advanced audio-video synchronized generation
- [ ] Open up an auctioning page for companies and groups to bid on validation topics using various currencies (in addition to TAO)
- [ ] Develop state-of-the-art video processing models for applications such as:
  - Transcription
  - Motion analysis
  - Object detection and tracking
  - Emotion recognition

### Phase 4: Application (Q4 2024)
- [ ] Train desktop & mobile action prediction models on the dataset
- [ ] Develop cross-platform digital agents MVP

### Phase 5: Democratization (Q1 2025)
- [ ] Generalize the subnet for miners to upload videos from any data source
- [ ] Incentivize people to record and label their own data using non-deep learning approaches
## Running Miners and Validators
### Running a Miner
#### Requirements
- Python 3.8+
- Pip
- GPU with at least 12 GB of VRAM, or 24 GB if you'd like to run a local LLM
- If running on RunPod, `runpod/pytorch:2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04` is a good base template.

#### Setup
1. To start, clone the repository and `cd` into it:
```bash
git clone https://github.com/omegalabsinc/omegalabs-bittensor-subnet.git
cd omegalabs-bittensor-subnet
```
2. Install ffmpeg. If you're on Ubuntu, just run: `apt-get -y update && apt-get install -y ffmpeg`.
3. Install pm2 if you don't already have it: [pm2.io](https://pm2.io/docs/runtime/guide/installation/).
4. Next, install the `omega` package: `pip install -e .`

#### Run with PM2
```bash
pm2 start neurons/miner.py --name omega-miner -- \
    --netuid {netuid} \
    --wallet.name {wallet} \
    --wallet.hotkey {hotkey} \
    --axon.port {port} \
    --blacklist.force_validator_permit
```

#### Tips for Better Incentive
The subnet has become quite competitive, and the basic miner template is no longer sufficient to earn good emissions and avoid deregistration. Here are some tips for improving your miner:
1. Use proxies or frequently change your pod.
   - We've heard good things about [Storm Proxies](https://stormproxies.com/).
2. Make sure your videos are unique. You can de-duplicate your collected videos with this [video ID index](https://huggingface.co/datasets/jondurbin/omega-multimodal-ids), graciously offered by Jon, one of the miners on the OMEGA subnet.
3. Improve the descriptions you submit alongside your uploaded videos. You can try using video captioning models or incorporating the transcript. There is lots of room for experimentation here.
4. You can use the `check_score` endpoint that we offer to check your score breakdown. See [this gist](https://gist.github.com/salmanshah1d/f5a8e83cb4af6444ffdef4325a59b489).
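For tip 2, the de-duplication step can be sketched in plain Python: load the known video IDs from the index into a set, then filter your candidate videos before submitting. The index format (one video ID per line) and the function names below are assumptions for illustration, not the subnet's actual API:

```python
# Hypothetical sketch of de-duplicating candidate videos against a known-ID index.
# Assumes the index is a text file with one video ID per line.

def load_known_ids(lines):
    """Build a set of already-indexed video IDs, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}

def deduplicate(candidates, known_ids):
    """Keep only candidate video IDs not already present in the index."""
    return [vid for vid in candidates if vid not in known_ids]

if __name__ == "__main__":
    # In practice the index lines would come from the Hugging Face dataset above.
    known = load_known_ids(["abc123\n", "def456\n", "\n"])
    print(deduplicate(["abc123", "xyz789"], known))  # ['xyz789']
```

A set gives O(1) membership checks, which matters once the index grows to millions of IDs.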

#### Common Troubleshooting Tips
1. If you've been running for several minutes and have not received any requests, make sure your port is open to incoming requests. You can try hitting your IP and port with `curl`; if you get no response, your port is not open.
2. You can check our [validator logs W&B](https://wandb.ai/omega-labs/omega-sn24-validator-logs) to see how your miner is scoring in practice.
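The port check in tip 1 can also be done from Python. Below is a minimal TCP-reachability sketch; the host and port values are placeholders you should replace with your miner's public IP and axon port:

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Placeholder values: use your miner's public IP and axon port here.
    print(port_reachable("127.0.0.1", 8091))
```

Note that this only confirms TCP reachability; the axon must still be serving requests correctly on that port.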

### Running a Validator
#### Requirements
- Python 3.8+
- Pip
- GPU with at least 24 GB of VRAM
- If running on RunPod, `runpod/pytorch:2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04` is a good base template.

#### Recommended
- Set up W&B by setting the environment variable: `export WANDB_API_KEY=<your API key>`. Alternatively, you can disable W&B with `--wandb.off`.

#### Setup
1. To start, clone the repository and `cd` into it:
```bash
git clone https://github.com/omegalabsinc/omegalabs-bittensor-subnet.git
cd omegalabs-bittensor-subnet
```
2. Install ffmpeg. If you used the RunPod image recommended above, ffmpeg is already installed. Otherwise, if you're on Ubuntu, just run: `apt-get -y update && apt-get install -y ffmpeg`.
3. Install pm2 if you don't already have it: [pm2.io](https://pm2.io/docs/runtime/guide/installation/).
4. Next, install the `omega` package: `pip install -e .`

#### Run auto-updating validator with PM2 (recommended)
```bash
pm2 start auto_updating_validator.sh --name omega-validator -- \
    --netuid {netuid} \
    --wallet.name {wallet} \
    --wallet.hotkey {hotkey} \
    --axon.port {port} \
    --logging.trace
```
Note: you might need to change `python` to `python3` within `neurons/auto_updating_validator.sh`, depending on your system's preferred Python.

#### Run basic validator with PM2
```bash
pm2 start neurons/validator.py --name omega-validator -- \
    --netuid {netuid} \
    --wallet.name {wallet} \
    --wallet.hotkey {hotkey} \
    --axon.port {port} \
    --logging.trace
```

## Contributing

We believe in the power of community and collaboration. Join us in building the world's largest decentralized multimodal dataset for AGI research! Whether you're a researcher, developer, or data enthusiast, there are many ways to contribute:

- Submit high-quality videos and annotations
- Develop and improve data validation and quality-control mechanisms
- Train and fine-tune models on the dataset
- Create applications and tools that leverage the dataset
- Provide feedback and suggestions for improvement

To get started, please see our [contribution guidelines](./CONTRIBUTING.md) and join our vibrant community on [Discord](https://discord.gg/opentensor).

## License

The OMEGA Labs Bittensor subnet is released under the [MIT License](./LICENSE).

---

🌟 Together, let's revolutionize AGI research and unlock the full potential of multimodal understanding! 🌟
</div>



---
File: /setup.py
---

# The MIT License (MIT)
# Copyright © 2023 Yuma Rao
# Copyright © 2023 Omega Labs, Inc.

# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the “Software”), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software.

# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
# THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import re
import os
import codecs
from os import path
from setuptools import setup, find_packages


def read_requirements(requirements_path):
    with open(requirements_path, "r") as f:
        return f.read().splitlines()


requirements = read_requirements("requirements.txt")
here = path.abspath(path.dirname(__file__))

with open(path.join(here, "README.md"), encoding="utf-8") as f:
    long_description = f.read()

# load the version string from omega/__init__.py
with codecs.open(
    os.path.join(here, "omega/__init__.py"), encoding="utf-8"
) as init_file:
    version_match = re.search(
        r"^__version__ = ['\"]([^'\"]*)['\"]", init_file.read(), re.M
    )
    if version_match is None:
        raise RuntimeError("Unable to find __version__ in omega/__init__.py")
    version_string = version_match.group(1)

setup(
    name="omega_bittensor_subnet",
    version=version_string,
    description="omega_bittensor_subnet",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/omegalabsinc/omega-bittensor-subnet",
    author="Omega Labs, Inc.",
    packages=find_packages(),
    include_package_data=True,
    author_email="[email protected]",
    license="MIT",
    python_requires=">=3.8",
    install_requires=requirements,
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Developers",
        "Topic :: Software Development :: Build Tools",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Topic :: Scientific/Engineering",
        "Topic :: Scientific/Engineering :: Mathematics",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
        "Topic :: Software Development",
        "Topic :: Software Development :: Libraries",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
)



---
File: /test_audio_dataset.py
---

import os
from io import BytesIO

import numpy as np
import pandas as pd
import soundfile as sf
from datasets import load_dataset

# Set the HF_TOKEN environment variable or pass the token directly
HF_TOKEN = os.getenv('HF_TOKEN')

# Load the dataset from the Hugging Face Hub
dataset = load_dataset("tezuesh/diarization_dataset", token=HF_TOKEN)

print(f"Dataset loaded successfully with {len(dataset['train'])} examples")

# Get the first row from the train split
first_row = dataset['train'][0]
print("\nKeys in first row:")
print(list(first_row.keys()))

print("\nLength of values in first row:")
for key in first_row.keys():
    if isinstance(first_row[key], list):
        print(f"{key}: {len(first_row[key])}")
    else:
        print(f"{key}: {first_row[key]}")

# Decode the raw audio bytes into a waveform and its sample rate
audio_bytes = first_row['audio_bytes']
audio_arr, sr = sf.read(BytesIO(audio_bytes))
print(len(audio_arr), type(audio_arr))
audio = np.array(audio_arr)
print(audio.shape)

# Save the full original audio
youtube_id = first_row['youtube_id']
os.makedirs('Dataset_audios/Original', exist_ok=True)
sf.write(f'Dataset_audios/Original/{youtube_id}.wav', audio, sr)

diar_timestamps_start = first_row['diar_timestamps_start']
diar_timestamps_end = first_row['diar_timestamps_end']
diar_speakers = first_row['diar_speakers']

# Cut the audio into per-speaker clips based on the diarization segments
for start, end, speaker in zip(diar_timestamps_start, diar_timestamps_end, diar_speakers):
    # Convert segment times (seconds) to sample indices
    start_sample = int(start * sr)
    end_sample = int(end * sr)

    # Extract the clip
    clip = audio[start_sample:end_sample]

    # Create the output directory if it doesn't exist
    os.makedirs(f'Dataset_audios/Clips/{youtube_id}', exist_ok=True)

    # Save the clip with speaker and timestamp info in the filename
    clip_filename = f'Dataset_audios/Clips/{youtube_id}/speaker_{speaker}_{start:.2f}-{end:.2f}.wav'
    sf.write(clip_filename, clip, sr)

# Collect the diarization segments into a list of records
diarization_data = []
for start, end, speaker in zip(diar_timestamps_start, diar_timestamps_end, diar_speakers):
    diarization_data.append({
        'youtube_id': youtube_id,
        'start_time': start,
        'end_time': end,
        'speaker': speaker,
        'duration': end - start
    })

# Convert to a pandas DataFrame and save as CSV
df = pd.DataFrame(diarization_data)
os.makedirs('Dataset_audios/Metadata', exist_ok=True)
df.to_csv(f'Dataset_audios/Metadata/{youtube_id}_diarization.csv', index=False)
