<html lang="en">

<!--
The UI code is modified from
https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm
-->

<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width" />
  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for speech enhancement</title>
  <style>
    h1,div {
      text-align: center;
    }
    textarea {
      width:100%;
    }
    .loading {
      display: none !important;
    }
  </style>
</head>

<body>
  <h1>
    Next-gen Kaldi + WebAssembly<br/>
    Speech Enhancement with <a href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a><br/>
    using <a href="https://github.com/Xiaobin-Rong/gtcrn">GTCRN</a>
  </h1>

  <div id="status">Loading...</div>

  <div id="singleAudioContent" class="tab-content loading">
    <div style="display: flex; gap: 1.5rem;">
      <!-- Input Section -->
      <div style="flex: 1; display: flex; flex-direction: column; gap: 1rem;">
        <div style="font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; display: flex; align-items: center; gap: 0.5rem; color: #6c757d;">
          <span style="line-height: 1;">🎵</span> Input
        </div>

        <!-- Drag and Drop / File Upload -->
        <div id="dropzone" style="border: 2px dashed #ced4da; border-radius: 8px; padding: 2rem; text-align: center; color: #6c757d; cursor: pointer; background-color: #f8f9fa; transition: background-color 0.3s, border-color 0.3s; position: relative;">
          <input type="file" id="fileInput" accept=".wav" style="position: absolute; top: 0; left: 0; opacity: 0; width: 100%; height: 100%; cursor: pointer;" />
          <p style="margin: 0;">Drop Audio Here (*.wav)<br>- or -<br>Click to Upload</p>
        </div>
        <audio id="inAudioPlayback" controls style="display: none; margin-top: 1rem; width: 100%;"></audio>
      </div>
    </div>

    <div style="display: flex; gap: 1.5rem;">
      <!-- Output Section -->
      <div style="flex: 1; display: flex; flex-direction: column; gap: 1rem;">
        <div style="font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; display: flex; align-items: center; gap: 0.5rem; color: #6c757d;">
          <span style="line-height: 1;">🎵</span> Output
        </div>
        <audio id="outAudioPlayback" controls style="display: none; margin-top: 1rem; width: 100%;"></audio>
      </div>
    </div>
  </div>

  <!-- Footer Section -->
  <div style="width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;">
    <h3>Description</h3>
    <ul>
      <li>Everything is <strong>open source</strong>; see the <a href="https://github.com/k2-fsa/sherpa-onnx">code</a>.</li>
      <li>The model is from <a href="https://github.com/Xiaobin-Rong/gtcrn">GTCRN</a></li>
      <li>Please upload .wav files.
        <ul>
          <li>You can download noisy test WAV files from <a href="https://htmlpreview.github.io/?https://github.com/Xiaobin-Rong/gtcrn_demo/blob/main/index.html">https://htmlpreview.github.io/?https://github.com/Xiaobin-Rong/gtcrn_demo/blob/main/index.html</a>.</li>
        </ul>
      </li>
      <li>If you have any issues, please either <a href="https://github.com/k2-fsa/sherpa-onnx/issues">file a ticket</a> or contact us via:
        <ul>
          <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#wechat">WeChat group</a></li>
          <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#qq">QQ group</a></li>
          <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b">Bilibili</a></li>
        </ul>
      </li>
    </ul>
    <h3>About This Demo</h3>
    <ul>
      <li><strong>Private and Secure:</strong> All processing runs locally in your browser on your device's CPU, using a single thread. No server is involved, so your audio never leaves your device. Once this page has loaded, you can disconnect from the Internet.</li>
      <li><strong>Efficient Resource Usage:</strong> No GPU is required, so GPU resources remain available for other tasks such as WebLLM analysis.</li>
    </ul>
    <h3>Latest Update</h3>
    <ul>
      <li>First working version.</li>
    </ul>

    <h3>Acknowledgement</h3>
    <ul>
      <li>The UI is adapted from <a href="https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a>.</li>
    </ul>
  </div>

  <script src="app-speech-enhancement.js"></script>
  <script src="sherpa-onnx-wave.js"></script>
  <script src="sherpa-onnx-speech-enhancement.js"></script>
  <script src="sherpa-onnx-wasm-main-speech-enhancement.js"></script>
</body>
</html>